<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Gowers&#039;s Weblog</title>
	<atom:link href="http://gowers.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://gowers.wordpress.com</link>
	<description>Mathematics related discussions</description>
	<lastBuildDate>Mon, 30 Jan 2012 11:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gowers.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Gowers&#039;s Weblog</title>
		<link>http://gowers.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gowers.wordpress.com/osd.xml" title="Gowers&#039;s Weblog" />
	<atom:link rel='hub' href='http://gowers.wordpress.com/?pushpress=hub'/>
		<item>
		<title>What&#8217;s wrong with electronic journals?</title>
		<link>http://gowers.wordpress.com/2012/01/29/whats-wrong-with-electronic-journals/</link>
		<comments>http://gowers.wordpress.com/2012/01/29/whats-wrong-with-electronic-journals/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 15:41:16 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Mathematics on the internet]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3940</guid>
		<description><![CDATA[It probably sounds disingenuous of me to say this, but when I sat down to write a post about Elsevier I wasn&#8217;t really trying to start a campaign. My intention was merely to make public, and a little more rigid, a policy that I and many others had already been applying, in my case without [...]]]></description>
			<content:encoded><![CDATA[<p>It probably sounds disingenuous of me to say this, but when I sat down to write a post about Elsevier I wasn&#8217;t really trying to start a campaign. My intention was merely to make public, and a little more rigid, a policy that I and many others had already been applying, in my case without much difficulty, for several years. The idea of setting up a website occurred to me as I was writing the post: I considered it (and still consider it) not as a petition to Elsevier to change its ways &#8212; since I don&#8217;t believe there is any realistic chance of that &#8212; but as a simple way to bring out into the open all the private boycotts and semi-boycotts that were going on, and thereby to encourage others to do the same. </p>
<p>By accident, the post seems to have been quite well timed. Probably it&#8217;s not an accident at all, and whatever atmosphere it was that prompted me to get round to writing the post (for example, certain discussions I had had with other mathematicians, some of them online) was the same as what made it a good moment. Anyhow, accident or no, the result is that some people have talked about &#8220;momentum&#8221;, and I&#8217;m starting to feel a responsibility, not particularly welcome (because it threatens to involve work), not to squander that momentum.<br />
<span id="more-3940"></span></p>
<p>I&#8217;ve actually been ill in bed for much of the last few days, so most of the rest of this post will be reporting on some feverish thoughts, which I&#8217;ll try to organize into a more coherent form. I&#8217;ll also try not to write too much, though that may be quite difficult.</p>
<p><strong>What next?</strong></p>
<p>What I really mean is more like, &#8220;How much next?&#8221; Do we just let the number of signatures at <a href="http://thecostofknowledge.com/" target="_blank">Tyler Neylon&#8217;s website</a> continue to grow at its currently healthy rate and sit back and hope that at some point there will be a phase change? That was something like my original plan &#8212; or rather non-plan. But there are reasons to suppose that provoking a phase change will take a bit more effort.  </p>
<p>I felt I had at least to think about that when <a href="http://gowers.wordpress.com/2012/01/23/httpthecostofknowledge-com/#comment-14789">Michael Harris made a comment</a> of which here is the beginning.</p>
<blockquote><p>When the number of signatures reaches a certain target figure — 500, say, or 1000 — the next step is to send an open letter to the members of the editorial board of one of the Elsevier journals, explaining why they might want either to look into changing publishers or, if this is impossible for contractual reasons, to resign. Since the editors are colleagues, the tone should not be confrontational. Instead, one should make the point that their remaining on the editorial board in the face of such a massive show of rejection will naturally be interpreted as a defense of Elsevier’s business practices; and more pragmatically, it will be more difficult to maintain the quality of a journal subject to boycott.</p>
<p>I’m willing to draft such a letter if there is sufficient interest and if no one else volunteers, though I’m hardly the most qualified to do so. It would need at least 20 signatures from a broad sampling of mathematical specialties.</p></blockquote>
<p>My initial impulse on reading this was to think that maybe that was moving a bit fast. I also latched on eagerly to the words &#8220;the tone should not be confrontational&#8221; and started mentally drafting letters full of assurances that they were not in any sense a criticism etc. etc. Meanwhile, it soon became clear that the 1000-signatures mark would be quickly passed, as it now has been. (However, the proportion of mathematicians has dropped. For a while it was almost 100% but now it is a lot less than that. So a target that might be appropriate is 1000 mathematicians. Restricting the list by subject is not yet possible, but Tyler Neylon assures me that it will become so. With a bit of effort, I&#8217;ve done a not terribly reliable count and concluded that there are 430 mathematicians so far.) </p>
<p>I then read this (written, as you can see, in response to <a href="http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/#comment-14659">another comment</a>).</p>
<blockquote><p>Stan,</p>
<p>We agree that technology is making publishing an electronic journal easy without technical expertise.</p>
<p>A group of current UChicago and former grad students and alums have created Scholastica, (http://www.scholasticahq.com), an academic journal management platform and scholarly community. Anyone can create their own peer reviewed journal, manage their peer review process, and ultimately publish without the need for publishing companies like Elsevier. There&#8217;s also a section of the application called &#8216;The Conversation&#8217; (http://scholasticahq.com/conversation) that is very similar to Mathoverflow that allows academics to build reputation points that can be used to be recruited as a referee.</p>
<p>We hope that this is seen as more than a shameless plug as we&#8217;ve been working tirelessly over the last year with no pay to provide something to address the problems with academic publishing that Tim and others describe here.</p>
<p>We would love your support.</p>
<p>- Rob Walsh<br />
Scholastica</p></blockquote>
<p>A little later, I had an exchange of emails with Brian Cody, another member of the Scholastica team, and it became clear that one of their aims was to make it almost effort free for the editors of a journal to do what the editors of Topology did: resign en masse and start again somewhere else with a modified name. Scholastica may well not be the only venture of its kind, and perhaps one can argue about whether it is the best, but what one can say now, with confidence, is that there is a web tool out there that makes the mechanics of starting up a new (but secretly not so new) journal almost trivial. I&#8217;d add that the site is in beta at the moment, with an eager team of developers who are ready to add features if there is a demand for them. I urge people to have a look.</p>
<p>It seems to me that if lots of mathematicians feel that enough is enough with Elsevier, and if it is easy to move a journal, then one really can start to think that something might happen sooner rather than later. But there is one snag, which brings me to the title of this post: a journal set up with Scholastica is electronic. [I write that without being 100% certain that it is correct -- I have written to them to check.]</p>
<p><strong>Electronic Journals.</strong></p>
<p>What&#8217;s wrong with that, you might ask? I don&#8217;t have a good answer, but I do have a bad answer, which is that I, and probably many other people, have an irrational prejudice against them. (There&#8217;s also a potentially better answer to do with whether electronic archives are likely to be as durable as paper ones have shown themselves to be, but I&#8217;m going to ignore that issue.) I grew up with the paper journal, I remember the thrill of seeing my first paper <em>in print</em>, I enjoyed browsing in libraries, I liked the long traditions that accompanied certain journals, and so on, and when the first electronic journals started, there just didn&#8217;t seem to be any point in submitting to them: why sacrifice that lovely paper when you didn&#8217;t have to? Somehow, electronic journals weren&#8217;t the real thing.</p>
<p>Recently, however, my prejudice has weakened. An obvious reason is that I don&#8217;t actually have any of the experiences that I enjoyed when I was starting out in my career: I can&#8217;t remember when I last set foot in a maths library, I think people have stopped sending me fifty offprints whenever a paper of mine comes out (which is a relief, as the ones I do have are a silly waste of shelf space, though I can&#8217;t bear to throw them away), the moment a paper &#8220;comes out&#8221; is nowadays the day I put it on the arXiv rather than the almost irrelevant day a couple of years later when it is published. In short, I do pretty well everything on my computer these days, so the idea of an electronic publication has lost the &#8220;unreal&#8221; feeling it used to have. </p>
<p>However, I do think that kind of prejudice probably still survives to a significant extent, and that it would be good to try to combat it. Here it seems to me that electronic journals have missed a trick. When I see the name &#8220;Electronic Journal of Combinatorics&#8221;, for example, my instinct is to read it as something like, &#8220;Journal of Combinatorics &#8212; except it&#8217;s only electronic&#8221;. In other words, the word &#8220;electronic&#8221; has entirely negative associations. (At this point I should say that yesterday out of curiosity I browsed the archive of the Electronic Journal of Combinatorics for the first time ever, and discovered to my surprise, and slight shame, that it was full of excellent papers by excellent mathematicians. Moreover, in the sample I looked at every single paper made me think, &#8220;Hmm, that looks interesting.&#8221; By way of apology, I shall submit to them when I next have a suitable paper. I was also shocked to discover that <a href="http://www.math.upenn.edu/~wilf/">Herb Wilf</a>, who founded the journal, died a few weeks ago. That news had passed me by.)</p>
<p>There must surely be ways that an electronic journal could exploit its electronic character in order to have a <em>positive</em> appeal. Why not have an electronic journal that isn&#8217;t run on quite the same lines as a conventional journal? Let me describe an imaginary new journal that would be close enough to conventional journals not to ruffle too many feathers but different enough that at least some people might find it dynamic, forward-looking, and somewhere one would love to be published. </p>
<hr />
<p><strong>Breakthroughs in Mathematics.</strong></p>
<p>The journal Breakthroughs in Mathematics is set up with one main aim: to accept papers only if they are outstanding. As its name suggests, the editors will be looking for papers that open up new areas, get past seemingly impregnable barriers, or solve long-standing open problems.</p>
<p>If you have written such a paper, why might you wish to submit it to Breakthroughs rather than to, say, Annals, Acta or the Journal of the AMS? Here are a few reasons.</p>
<p>1. Our attitude is that if you publish with us, then we are doing you a favour rather than the other way round. The journal does not have a print version, so there is no need to fill issues with papers that do not meet its exacting standards. If a few months go by without a breakthrough, then that&#8217;s fine by us. The average number of papers published so far has been about ten per year, so publication in Breakthroughs is something of an event in the way that publication in a conventional journal, however prestigious, is not.</p>
<p>2. We have a large, youthful and diverse editorial board, consisting mainly of mathematicians who are active on the internet. If that is not your thing, then by all means submit to a conventional journal, but if you are part of the internet generation of mathematicians, then you may feel more at home at Breakthroughs. </p>
<p>3. The submission and refereeing process works as follows. Authors are required to submit not just their papers but also a short account of their work, in which they should explain their result in terms that are comprehensible to mathematicians outside their speciality, paying particular attention to what it is that makes it more than just an ordinary piece of very good mathematics. There is then an initial filtering process by the editorial board, helped by quick opinions solicited from experts in the relevant areas, which is based more on the short account of the paper than on the paper itself and is intended to establish whether the result is sufficiently interesting to sufficiently many editors to be publishable in Breakthroughs. In the rare event that it is, the paper then goes to a technical referee, whose job is not to evaluate the paper, but simply to comment on how it is written and to check that the author has done what he or she claims to have done. </p>
<p>4. The technical referee is not anonymous. Indeed, he or she is positively encouraged to interact with the author, asking for help in understanding difficult parts of a paper, and so on. Authors can even nominate their own technical referee if they wish, though Breakthroughs has the final say.</p>
<p>5. When the paper is published, it appears along with an explanation, written by a suitable member of the editorial board, of why it is deemed important enough to appear in Breakthroughs. This will typically be based on the short account provided by the author, as well as on remarks made by the referees, and possibly on other sources such as online discussion of the result (which will typically by this time be quite well known, though we aim to deal with our papers quickly). It also comes with a comments page, to which anybody can contribute remarks about the paper &#8212; such as alternative proofs of certain steps, notification of applications, and the like. The author can respond to these remarks. In these ways, we attempt to give a bit of publicity to the papers we publish, and to provide some context for the general reader.</p>
<p>6. We have made a serious attempt to be precise about what is required of a paper for it to be published in Breakthroughs. For details, see our page, &#8220;What is a breakthrough?&#8221; Of course, it is impossible to give exact necessary and sufficient conditions, but the fact that we at least try makes it clearer what it means to have a Breakthroughs in Mathematics paper on your CV than it would if we simply said that we had very high standards. </p>
<hr />
<p><strong>But still: what now?</strong></p>
<p>A journal like that is not going to answer the need for new journals to replace the overpriced conventional ones, but it could at least make electronic journals sexy in a way that they aren&#8217;t at the moment. It would also have the great virtue of not requiring much work of the editors. (It would require quite a lot of work per accepted paper, but the number of accepted papers would be very small.) </p>
<p>I&#8217;m aware though that I haven&#8217;t really faced up to the question of whether the editors of an Elsevier journal should be gently encouraged to consider switching publishers. As a matter of fact, I heard from an Elsevier editor recently. Let me call him/her X. X had approached a potential referee and had just received a refusal in which my earlier blog post was mentioned. X was somewhat critical of encouraging people not to referee for Elsevier journals, but said that he/she had some sympathy with the reasons. My guess is that on any journal there will be a small handful of very active editors, often just the official main editors, who in a sense &#8220;are&#8221; the journal and whose lives could be a little disrupted, and a much wider set of editors who wouldn&#8217;t at all mind moving if there were good reasons to do so. </p>
<p>How much of an imposition this would be would depend on a number of factors. One factor I find hard to judge because of my lack of experience running journals is probably the most important: the extent to which the smooth running of a journal depends on a good relationship between the managing editors and certain representatives, who may have genuine mathematical sympathies and expertise, of the publishers. Giving up a relationship like that would be a genuine sacrifice unless there was a realistic prospect of a new and similar relationship to take its place. Asking a print journal to go electronic would also be asking quite a lot, though, for reasons I indicated above, perhaps not too much.</p>
<p><strong>Combinatorics journals.</strong></p>
<p>In the course of writing the last couple of paragraphs I found myself thinking about the situation in combinatorics, and I have come to realize that I am on the editorial boards of at least two Springer journals: the Annals of Combinatorics, which is not really my kind of combinatorics and has involved zero work, and Combinatorica, which is one of my favourite maths journals. Since the general view seems to be that Springer has become a problem company as well, I should perhaps consider my position. I find it quite hard to get comprehensible information about the prices of these journals, but I think that if I could sell the back numbers that I&#8217;ve received from them at their official cost price, I could go on a round-the-world cruise and still have plenty of change. </p>
<p>What are the options if you want to publish a good result in combinatorics? (Here, I&#8217;m mainly talking about Hungarian-style combinatorics rather than enumerative or algebraic combinatorics.) If the result is interesting enough, you could of course publish in a general-interest journal, but let&#8217;s suppose you want it to appear in a specialist journal. The list of journals that would naturally spring to my mind is this. I&#8217;ll also give my associations with each one, which should not be taken seriously because I haven&#8217;t made any effort to test whether they are correct. I&#8217;m sure other people have different pecking orders.</p>
<p>Combinatorica: used to be regarded as the number one journal in combinatorics, and very possibly still is; quite slow and with a big backlog (that was true once but may be out of date). [Springer]</p>
<p>Discrete Mathematics: good solid journal; not of the absolute top rank. [Elsevier]</p>
<p>Journal of Combinatorial Theory A: good solid journal; not of the absolute top rank. [Elsevier]</p>
<p>Journal of Combinatorial Theory B: good solid journal; not of the absolute top rank. [Elsevier]</p>
<p>European Journal of Combinatorics: OK, but not as good as I thought it was when I submitted a paper I very much liked to it twenty years ago. [Elsevier]</p>
<p>Random Structures and Algorithms: very good; lots of interesting papers. [Wiley]</p>
<p>Combinatorics, Probability and Computing: a personal favourite; set up recently(ish) by B&eacute;la Bollob&aacute;s and maintains a high standard. [Cambridge University Press]</p>
<p>Electronic Journal of Combinatorics: now that I&#8217;ve actually looked into it &#8230; good.</p>
<p>I&#8217;ve probably missed some obvious further possibilities there, but the fact remains that that is my mental list of good combinatorial journals, and if I want to avoid the big publishing houses then my list goes down from eight to two. It&#8217;s not as bad as it sounds though. The only one of those journals that I&#8217;ve actually submitted to is Combinatorics, Probability and Computing, and the only one of the first six that I&#8217;d feel sad about boycotting is Combinatorica, though I also feel quite positive about Random Structures and Algorithms.   </p>
<p>So if anything is to be done about outrageously high journal prices in combinatorics, it looks as though new journals, or migration of existing ones, will be needed. (Incidentally, I&#8217;m writing all this on the assumption that we stick with something close to the current system of journals providing varying stamps of quality. Obviously other systems are possible, but persuading large numbers of mathematicians to move to those systems would be much more of a challenge.) </p>
<p><strong>Are there two kinds of mathematician?</strong></p>
<p>I was quite surprised that the reaction to the idea of a boycott was as positive as it was: I had expected a more divided response. I still wonder whether the true response <em>is</em> more divided. Could it be that the kind of mathematician who participates fully in online discussions on blogs, Mathoverflow etc. is naturally enthusiastic, whereas a more traditionally-minded mathematician just wants to be left alone to continue with a way of doing things that seems perfectly satisfactory? If so, then the apparently strong support could be misleading. I think it is this thought that makes me want to tread carefully after reading Michael Harris&#8217;s suggestion. But treading carefully doesn&#8217;t necessarily mean not treading at all. I&#8217;d be very interested to know what other people think about this: is there some moment that needs to be seized, or should we simply sit back and watch the number of signatures grow?</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2012/01/29/whats-wrong-with-electronic-journals/feed/</wfw:commentRss>
		<slash:comments>38</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>http://thecostofknowledge.com</title>
		<link>http://gowers.wordpress.com/2012/01/23/httpthecostofknowledge-com/</link>
		<comments>http://gowers.wordpress.com/2012/01/23/httpthecostofknowledge-com/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 17:33:32 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3935</guid>
		<description><![CDATA[Many thanks to Tyler Neylon for designing a website where one can declare one&#8217;s unwillingness to work for Elsevier journals. Already, without any announcement apart from brief mentions quite some way into the comments on the last post, it has 31 signatures, many of them from France, where for various reasons they are particularly annoyed [...]]]></description>
			<content:encoded><![CDATA[<p>Many thanks to Tyler Neylon for designing <a href="http://thecostofknowledge.com/">a website where one can declare one&#8217;s unwillingness to work for Elsevier journals</a>. Already, without any announcement apart from brief mentions quite some way into the comments on the last post, it has 31 signatures, many of them from France, where for various reasons they are particularly annoyed with Elsevier.</p>
<p>This post is primarily to give the site some visibility, which I&#8217;ll also do on Google+ (if you support the venture, then please spread the word). It is not necessarily to persuade you to sign. I well understand that we are all in different situations and signing is easier for some people than others. But one thing I would definitely say is that if you already have a private non-cooperation policy (as I myself have done for years) then you will have much more effect if you go public about it. As I said in my previous post, the more people who sign, the more morally and socially acceptable it becomes to sign too: a private protest is just a nuisance to other mathematicians, but a larger and more public one may have a chance of achieving something. So I hope that each signature will beget several others, at least for a while.<br />
<span id="more-3935"></span></p>
<p>In the interests of balance, let me briefly mention two arguments <em>against</em> signing. (If you can think of others, then please let me know in the comments.) One is that Elsevier already allows authors to keep versions of their papers on the arXiv. This considerably weakens the argument that Elsevier papers, once published, disappear behind a very expensive paywall. (It also means that submitting to an Elsevier journal and <em>not</em> putting your article on the arXiv is a dereliction of duty.) Nevertheless, having to make do with arXiv versions is an inconvenience. For example, the page references in the arXiv version will be different from those in the journal. (Another principle: if you refer to an Elsevier paper, do so in a page-independent way such as, &#8220;See the discussion just after Lemma 3.1 in [XYZ].&#8221;) Also, it is not standard practice to refer to the arXiv versions of other papers if there are print versions.</p>
<p>The other argument that carries some force is that Springer and other publishers are just as bad as Elsevier. For instance, Springer too goes in for bundling. David Savitt put the counterargument nicely in a comment on the previous post, of which I quote a paragraph.</p>
<blockquote><p>Certainly one can debate whether Elsevier is the right specific target, but I do think that if one wants to build some sort of movement, it&#8217;s best to start out in a relatively specific way.  Targeting a particular bad behavior in a broad way may leave so few alternatives as to be impractical for many individuals, and if individuals can&#8217;t make a pledge and stick to it then one isn&#8217;t going to get anywhere.  You also have to ask, pragmatically, what&#8217;s going to get a large number of people to participate?  A high-minded commitment to a broad principle takes much more effort than a boycott of a specific company.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2012/01/23/httpthecostofknowledge-com/feed/</wfw:commentRss>
		<slash:comments>54</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Elsevier &#8212; my part in its downfall</title>
		<link>http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/</link>
		<comments>http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 17:30:31 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3912</guid>
		<description><![CDATA[The Dutch publisher Elsevier publishes many of the world&#8217;s best known mathematics journals, including Advances in Mathematics, Comptes Rendus, Discrete Mathematics, The European Journal of Combinatorics, Historia Mathematica, Journal of Algebra, Journal of Approximation Theory, Journal of Combinatorics Series A, Journal of Functional Analysis, Journal of Geometry and Physics, Journal of Mathematical Analysis and Applications, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3912&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The Dutch publisher Elsevier publishes many of the world&#8217;s best-known mathematics journals, including Advances in Mathematics, Comptes Rendus, Discrete Mathematics, The European Journal of Combinatorics, Historia Mathematica, Journal of Algebra, Journal of Approximation Theory, Journal of Combinatorial Theory Series A, Journal of Functional Analysis, Journal of Geometry and Physics, Journal of Mathematical Analysis and Applications, Journal of Number Theory, Topology, and Topology and its Applications. For many years, it has also been heavily criticized for its business practices. Let me briefly summarize these criticisms.</p>
<p>1. It charges very high prices &#8212; so far above the average that it seems quite extraordinary that they can get away with it.</p>
<p>2. One method that they have for getting away with it is a practice known as &#8220;bundling&#8221;, where instead of giving libraries the choice of which journals they want to subscribe to, they offer them the choice between a large collection of journals (chosen by them) or nothing at all. So if <em>some</em> Elsevier journals in the &#8220;bundle&#8221; are indispensable to a library, that library is forced to subscribe at very high subscription rates to a large number of journals, across all the sciences, many of which they do not want. (The journal Chaos, Solitons and Fractals is a notorious example of a journal that is regarded as a joke by many mathematicians, but which libraries all round the world must nevertheless subscribe to.) Given that libraries have limited budgets, this often means that they cannot subscribe to journals that they would much rather subscribe to, so it is not just libraries that are harmed, but other publishers, which is of course part of the motivation for the scheme. </p>
<p>3. If libraries attempt to negotiate better deals, Elsevier is ruthless about cutting off access to all their journals.</p>
<p>4. Elsevier supports many of the measures, such as the <a href="http://en.wikipedia.org/wiki/Research_Works_Act">Research Works Act</a>, that attempt to stop the move to open access. They also supported SOPA and PIPA and lobbied strongly for them.<br />
<span id="more-3912"></span></p>
<p>I could carry on, but I&#8217;ll leave it there. </p>
<p>It might seem inexplicable that this situation has been allowed to continue. After all, mathematicians (and other scientists) have been complaining about it for a long time. Why can&#8217;t we just tell Elsevier that we no longer wish to publish with them?</p>
<p>Well, part of the answer is that we <em>can</em>. A famous (and not unique) example where we did so was the resignation of the entire editorial board of Topology and the founding of The Journal of Topology &#8212; the story is told briefly <a href="http://en.wikipedia.org/wiki/Topology_(journal)">here</a>. But as the list above shows, such examples are very much the exception rather than the rule, so the basic question remains: why do we allow ourselves to be messed about to this extraordinary extent, when one would have thought that nothing would be easier than to do without them?</p>
<p>A possible explanation is that to do something about the situation requires coordinated action. Even if one library refuses to subscribe to Elsevier journals, plenty of others will feel that they can&#8217;t refuse, and Elsevier won&#8217;t mind too much. But if all libraries were prepared to club together and negotiate jointly, doing a kind of reverse bundling &#8212; accept this deal or none of us will subscribe to any of your journals &#8212; then Elsevier&#8217;s profits (which are huge, by the way) would be genuinely threatened. However, it seems unlikely that any such massive coordination between libraries will ever take place.</p>
<p>What about coordination between academics? What is to stop all the other editorial boards of Elsevier journals following the example of the board of Topology? I actually don&#8217;t know the answer to that: I can only assume that too few people on those editorial boards care enough to go through what is likely to be a somewhat unpleasant and time-consuming process.</p>
<p>If top-down approaches to the problem don&#8217;t work, then what about bottom-up approaches? Why do any of us publish papers in Elsevier journals? Let me answer that question in my own case. I have a paper in the European Journal of Combinatorics, which I submitted about 20 years ago, before I knew anything about the objections to Elsevier. And what&#8217;s more, I didn&#8217;t know it was an Elsevier journal until a few days ago. (Part of my reason for listing the journals at the beginning of this post was to make the second excuse less valid for anyone who reads this. A more complete list can be found <a href="http://www.elsevier.com/wps/find/P11.cws_home/mathjournals">here</a>.) </p>
<p>Once I did hear about Elsevier&#8217;s behaviour, I made a conscious decision not to publish in Elsevier journals and I started to feel bad about cooperating with them in any way. I didn&#8217;t go as far as to refuse, but if, say, I was asked to join the editorial board of an Elsevier journal and wasn&#8217;t quite sure I wanted to, then the fact that it was Elsevier was enough to make my mind up. (This actually happened. I was a little cowardly and gave it as an additional reason for reluctance rather than the main reason, but I did at least mention it.) I am not knowingly on the editorial board of any Elsevier journal, and haven&#8217;t been in the past either.</p>
<p>Now, however, I have decided that my previous quiet approach was not enough. I think another reason that we cooperate with Elsevier is simply that it is embarrassing not to. If I&#8217;m asked to referee a paper for an Elsevier journal and I am clearly an appropriate choice of referee, then refusing to do it feels like a criticism of the editor who has asked me, who may well be somebody I know. It also feels like shirking my duty and slightly letting down the authors, who may well also be people I know. </p>
<p>That is why the <em>moral</em> argument in favour of refusing to cooperate, as an individual, with Elsevier is not quite straightforward. Indeed, if we were just to accept Elsevier&#8217;s abuses as an unfortunate fact of life that is not going to go away, then there would be a genuine argument that refusing to cooperate with them is the wrong thing to do. However, I think that the abuses <em>are</em> eventually going to go away &#8212; the internet will see to that &#8212; so I think that the doing-my-duty argument is outweighed by the argument that it is in the interests of the mathematical community to get to that happy day as soon as we can. I also don&#8217;t see any argument at all against refusing to submit papers to Elsevier journals.</p>
<p>So I am not only going to refuse to have anything to do with Elsevier journals from now on, but I am saying so publicly. I am by no means the first person to do this, but the more of us there are, the more socially acceptable it becomes, and that is my main reason for writing this post. </p>
<p>It occurs to me that it might help if there were a website somewhere, where mathematicians who have decided not to contribute in any way to Elsevier journals could sign their names electronically. I think that some people would be encouraged to take a stand if they could see that many others were already doing so, and that it would be a good way of making that stand public. Perhaps such a site already exists, in which case I&#8217;d like to hear about it and add my name. If it doesn&#8217;t, it should be pretty easy to set up, but way beyond my competence I&#8217;m afraid. Is there anyone out there who feels like doing it? </p>
<p>Returning to the subject of morality, I don&#8217;t think it is helpful to accuse Elsevier of immoral behaviour: they are a big business and they want to maximize their profits, as businesses do. I see the argument as a straightforward practical one. Yes, they are like that, as one would expect, but we have much greater bargaining power than we are wielding at the moment, for the very simple reason that we don&#8217;t actually <em>need</em> their services. That is not to say that morality doesn&#8217;t come into it, but the moral issues are between mathematicians and other mathematicians rather than between mathematicians and Elsevier. In brief, if you publish in Elsevier journals you are making it easier for Elsevier to take action that harms academic institutions, so you shouldn&#8217;t. (I&#8217;m thinking of stories I&#8217;ve been told about mathematicians at major universities who have been cut off from Elsevier journals. Something I don&#8217;t know, but would be interested to learn, is whether mathematicians in developing countries can afford to get access to Elsevier journals. If not, then that would be another powerful moral argument against submitting to them.) </p>
<p>Even if so many mathematicians refused to cooperate with Elsevier that the quality of their journals plummeted, that wouldn&#8217;t necessarily force Elsevier to change its ways, since it could continue to bundle its by now rubbishy mathematics journals together with important journals in physics, chemistry and biology. However, it would be a powerful gesture &#8212; perhaps even powerful enough for other sciences to follow suit eventually &#8212; and at least mathematics would be free of the problem. </p>
<p>One final remark is that Elsevier is not the only publisher to behave in an objectionable way. However, it seems to be the worst. </p>
<p>PS For non-British readers, the titles of this post and the previous one are an oblique reference to <a href="http://www.amazon.co.uk/Adolf-Hitler-Part-his-Downfall/dp/0140035206">this book</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/feed/</wfw:commentRss>
		<slash:comments>197</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>SOPA &#8212; my part in its downfall</title>
		<link>http://gowers.wordpress.com/2012/01/17/sopa-my-part-in-its-downfall/</link>
		<comments>http://gowers.wordpress.com/2012/01/17/sopa-my-part-in-its-downfall/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 10:23:25 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3902</guid>
		<description><![CDATA[If you haven&#8217;t heard, SOPA, which stands for Stop Online Piracy Act, is a US bill that was proposed in order to do what its name suggests. Although it has been defeated for now, its proponents have not given up, so many websites, notably including Wikipedia, are going on strike tomorrow (January 18th) in order [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3902&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>If you haven&#8217;t heard, SOPA, which stands for Stop Online Piracy Act, is a US bill that was proposed in order to do what its name suggests. Although it has been defeated for now, its proponents have not given up, so many websites, notably including Wikipedia, are going on strike tomorrow (January 18th) in order to show just how potentially damaging the bill could be to the internet. I haven&#8217;t looked in much detail into what the adverse consequences of SOPA would be, but I&#8217;ve read enough, from people whose opinions I trust, to believe that I should join this strike. My technical competence is insufficient to follow the instructions that have been offered for doing this (and the same applies to any instructions that anyone reading this might feel moved to offer, so I suggest not bothering). Therefore, I plan to mark this blog as private (and therefore inaccessible) for the day, an operation that I will undo on Thursday.</p>
<p>If you&#8217;d like more details about what&#8217;s wrong with the bill, then Google &#8220;SOPA&#8221; and you&#8217;ll find all you could possibly want.</p>
<p><strong>Edit:</strong> I was about to change the blog to private when I noticed that WordPress has a Protest SOPA/PIPA setting. I&#8217;ve gone for that. It results in the ribbon you see in the top right-hand corner of this page, and a total blackout, with a page explaining why, from 8am to 8pm EST. So that will kick in properly at 1pm UK time.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2012/01/17/sopa-my-part-in-its-downfall/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Farewell to a pen-friend</title>
		<link>http://gowers.wordpress.com/2011/12/18/farewell-to-a-pen-friend/</link>
		<comments>http://gowers.wordpress.com/2011/12/18/farewell-to-a-pen-friend/#comments</comments>
		<pubDate>Sun, 18 Dec 2011 17:07:16 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3868</guid>
		<description><![CDATA[A few days ago I learnt from the Guardian of the death of the novelist and critic Gilbert Adair. I was saddened by this, partly because I have hugely enjoyed his writing (though I&#8217;m glad to say that I haven&#8217;t read his entire oeuvre, so there are still treats in store) and partly because I [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3868&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A few days ago I learnt <a href="http://www.guardian.co.uk/film/2011/dec/09/gilbert-adair">from the Guardian</a> of the death of the novelist and critic <a href="http://en.wikipedia.org/wiki/Gilbert_Adair">Gilbert Adair</a>. I was saddened by this, partly because I have hugely enjoyed his writing (though I&#8217;m glad to say that I haven&#8217;t read his entire oeuvre, so there are still treats in store) and partly because I knew him. The title of this post is a pun of a kind I hope he would have approved of: our interactions were mostly by email, but one can also take the &#8220;pen&#8221; to mean &#8220;almost&#8221; (as in &#8220;peninsula&#8221;), which is why I used a hyphen. We met a couple of times, and might have become proper friends if I had been less socially lazy. It turns out that he had a stroke a year ago, but I didn&#8217;t hear about it, so his death just over a week ago came as a surprise and leaves me regretting that I didn&#8217;t see more of him while I had the chance.</p>
<p>Since there&#8217;s nothing I can do about that, I thought that I&#8217;d try to use this blog as an outlet for the resulting feeling of loss, which is out of proportion to the amount that I actually had to do with him. Or perhaps it isn&#8217;t, since the very fact that I didn&#8217;t see him much is part of what now bothers me. It is also why I had no idea that my last contact with him might be my last, and why his death now seems a bit unreal.</p>
<p>A maths blog is not a completely inappropriate place to write about him, because I met him through mathematics and it was because of mathematics, which fascinated him, that that initial meeting led to a couple of further meetings. A secondary purpose of this post is to recommend his books, which are extremely clever in a way that many mathematicians would like. I&#8217;ll describe some of them as I go along.<br />
<span id="more-3868"></span></p>
<p>If you have a long memory, then perhaps you will remember an event called The Faber Challenge, which accompanied the publication of Apostolos Doxiadis&#8217;s novel <a href="http://www.amazon.co.uk/Petros-Goldbachs-Conjecture-Apostolos-Doxiadis/dp/0571205119/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1324201804&amp;sr=1-1">Uncle Petros and Goldbach&#8217;s Conjecture</a>. The challenge was to prove Goldbach&#8217;s conjecture within two years, for which the publishers offered a prize of a million pounds. This challenge was issued at around the same time as the announcement of the <a href="http://en.wikipedia.org/wiki/Millennium_Prize_Problems">Clay Millennium Problems</a>, which was a little irritating to the Clay people since their prizes were &#8220;only&#8221; a million dollars. Of course, the time limit rendered the Faber prize more or less meaningless, which is why people do not talk about it now. Indeed, the publishers didn&#8217;t expect to have to pay up, so instead of setting aside a fund of a million pounds, they simply took out insurance against the highly unlikely event of somebody solving the problem. I never found out what the premium was.</p>
<p>I was contacted by Toby Faber, who worked for Faber at the time (no, it wasn&#8217;t a coincidence) and who had been at school with me, because they needed a small team of mathematicians to look at any serious attempts that there might be. I was assured that they would have a rigorous initial filtering process that would mean that I would not need to look through hundreds of documents written by cranks, so I agreed to do it. As a result, I was invited to the launch party in London, where I met Apostolos Doxiadis for the first time. After the party, Toby asked whether I wanted to join a few people, including Apostolos, to go and eat at a nearby restaurant. That felt like a good addition to life&#8217;s rich tapestry, so I said yes. It was one of those situations where the others knew each other better than they knew me, but that meant that I got asked quite a lot of questions. </p>
<p>One of the people there was an amusing man in David Hockney-style glasses (I read now that he was in fact a friend of David Hockney) who was sitting at the end of the table over to my left. I can&#8217;t remember much else about him from that occasion, but afterwards Toby Faber and I shared a taxi and for some reason it came up in our conversation that that man had been Gilbert Adair. Damn, I thought, and said, because I was a fan of his film reviews in the Independent on Sunday (which at the time was a good paper) and would have liked to say so. But the opportunity was not gone for ever, because he wanted to say something to me too.</p>
<p>It turned out that he had developed a theory of infinitesimals and wanted to know whether I found it interesting. I more or less knew before looking at it that I wouldn&#8217;t, since whatever he had to say would almost certainly be either wrong or correct but well known. We had a brief email exchange about it, and then he sent me some speculations about numbers such as <img src='http://s0.wp.com/latex.php?latex=0.000%5Cdots+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0.000&#92;dots 1' title='0.000&#92;dots 1' class='latex' />, where there are infinitely many zeros before the 1. My response to his basic ideas was that adding to the number system becomes interesting only if you can extend the basic arithmetical operations to the new numbers. (The kind of question I asked was what happened to his number if he multiplied it by 10.) A subtler response, and a harder one for a non-mathematician to grasp, was that even if you manage that, which is possible (as is well known), you still need to explain why the extended system is worth bothering with. He thought that Cantor&#8217;s higher infinities were just an amusing game that mathematicians could do without, and that he was doing something similar for infinitesimals.</p>
<p>Despite my attempts to pour cold water on his ideas, he suggested more than once that if I was ever in London, then we should get together for lunch. My parents live in London, and the idea of properly meeting Gilbert Adair appealed to me so I took him up on it.</p>
<p>I can&#8217;t remember at what point I put two and two together and realized that he was the person I had read about a few years earlier who had successfully taken on the crazy project of translating Georges Perec&#8217;s famous book <a href="http://fr.wikipedia.org/wiki/La_Disparition_(roman)">La Disparition</a>, a novel (in French) that does not use the letter E. It goes without saying that <a href="http://en.wikipedia.org/wiki/A_Void">Gilbert Adair&#8217;s translation, A Void</a>, also does without the letter E. I remember being furious with myself when I got through an entire review of the book without noticing that it too avoided the letter E. I&#8217;ll resist the temptation to play that kind of game in this post. (If you want to see a recent example of somebody who didn&#8217;t resist it, try <a href="http://www.radix-communications.com/about-a-void/">this</a>.)</p>
<p>More generally, I find it hard to disentangle my memories of getting to know Gilbert Adair the man and getting to know his books, which is partly because he put quite a lot of himself in his books. However, I do know that we met for lunch in a restaurant in Notting Hill (not far from where he lived), about which I don&#8217;t remember too much apart from the general atmosphere: the restaurant was fairly empty, and the conversation slightly awkward &#8212; it was hard to imagine that anything non-mathematical I had to say would be of interest to such a witty and cultured man, and there was only so much to be said about mathematics. </p>
<p>Fairly soon after that I went to Princeton for two years, during which time I finished writing <a href="http://www.amazon.co.uk/Mathematics-Very-Short-Introduction-Introductions/dp/0192853619/ref=pd_sim_b_1">Mathematics, A Very Short Introduction</a>. He used to devour popular mathematics books, so I had told him I was writing one myself. Since he was exactly the kind of reader I had in mind &#8212; someone who wasn&#8217;t satisfied with gee-whizzery and preferred to understand what he was reading &#8212; I asked him if he would be prepared to read a draft of it for me and let me know if there was anything he didn&#8217;t like about it. </p>
<p>In the process of checking back through my emails, I see that I&#8217;ve actually misremembered how things happened. What actually happened was that we had a correspondence about infinitesimals in 2000 and the lunch was in the summer of 2001 when I was back from Princeton for the summer. I sent him a copy of Mathematics, A Very Short Introduction late that autumn. He wrote back with a list of comments, some minor and some less so. Every so often, someone makes a comment that actually changes my writing style, and one of his came into that category. Let me quote him exactly:</p>
<blockquote><p>As I recall, you asked me, in one of your emails, to raise any problems I might have with the book&#8217;s literary style. I have none&#8230; except that, for my taste, you&#8217;re a little too fond of commas, especially those followed by &#8216;and&#8217;, and also without a change of subject, two factors which I would say obviate the need for them. For just one example, the very first sentence on page 1.</p></blockquote>
<p>I now think much harder about commas before the word &#8220;and&#8221;. The one time I feel they are needed is in sentences with another lower-level &#8220;and&#8221;, such as, &#8220;I went for a walk with Peter and Jane, and the weather was glorious.&#8221; Without the comma, there&#8217;s a danger that the reader will think for a split second that the weather came along for the walk too. (This is not meant as a counterexample to Gilbert Adair&#8217;s advice.)</p>
<p>Another of his remarks was the following.</p>
<blockquote><p>Finally, the last chapter. It still feels to me slightly tacked on, probably for no other reason than that there are too few issues discussed. I would propose that you need at least ten for the chapter to feel like an integral part of the book. A trio of suggestions: a) Why do many people so actively detest mathematics?; b) What is the point of pure mathematics?; and c) Is there any likelihood today of an amateur solving an important mathematical problem?</p></blockquote>
<p>I didn&#8217;t fancy b), since the whole book was meant to be about that, but I added sections on his topics a) and c) and I think they enhanced the book. There was one other connection between him and the book, which was that when Oxford University Press told me they needed an &#8220;endorsement&#8221; on the cover, he was a natural choice to provide it. It&#8217;s a bit embarrassing to ask somebody if they will please write a sentence or two praising one&#8217;s book, and I don&#8217;t know whether he actually wrote the sentence that now appears on it or just put his name to a suggestion of OUP&#8217;s. What it says is, &#8220;A marvellously lucid guide to the beauty and mystery of numbers,&#8221; which wasn&#8217;t quite the intention of the book (which was about more than just numbers and was trying to demystify) but which did the job. [Edit: I've found an email that suggests that he planned to write something rather different, so I suspect OUP here.]</p>
<p>When I arrived in Princeton with a young family, we left almost all our books behind in England, so one of the first things we did was go to sales of second-hand and remaindered books so that we could stock up our shelves. Amongst the many books that we acquired, two were by Gilbert Adair. One was <a href="http://en.wikipedia.org/wiki/Alice_Through_the_Needle's_Eye">Alice Through the Needle&#8217;s Eye</a>, a third volume of the famous Alice books that is so well done (I read it to my children) that it might as well have been written by Lewis Carroll himself. That&#8217;s a very strong claim, I realize, and may make you suspect that I was just too insensitive to notice the inevitable false notes. If that&#8217;s what you think, I suggest you try to get hold of the book (which unfortunately isn&#8217;t easy). </p>
<p>The other book was <a href="http://www.amazon.co.uk/Surfing-Zeitgeist-Gilbert-Adair/dp/0571179916">Surfing the Zeitgeist</a>, a collection of short essays that originated as a weekly column in the Sunday Times, each one called &#8220;On X&#8221; for some X. To give you a flavour of these, here&#8217;s a paragraph from &#8220;On the theatre&#8221;, in which he argues that the theatre should not be regarded as somehow more elevated an art form than cinema.</p>
<blockquote><p>Nor is it just a question of how they [=people who go to the theatre] talk, but of how they <em>laugh</em>. This is a fiendishly elusive notion, not easy to communicate in print, but let me try anyway. Think of the last time you were in a theatre. Think of the sense of occasion that you experienced, the sense that, for once in your life, you were doing something exceptional. You bought a programme, you took your seat, you removed your overcoat, you positively rustled with expectation. Then the lights dimmed . . . Now it is unimportant whether the play was a comedy or not, whether it was by Shakespeare or David Hare or someone you had never heard of, when one of the characters on that stage said something that was even mildly amusing, you laughed. At Shakespeare&#8217;s feeblest puns, puns a thousand times overtaken by the three centuries which separate him from us, you actually found yourself laughing aloud &#8212; as loudly as you would laugh at the boss&#8217;s jokes during an after-dinner speech or the best man&#8217;s at a wedding reception. You laughed, in short, because it would have been <em>rude</em> not to, because the person making the joke was standing in front of you. In the cinema, by contrast, where there is seldom a sense of occasion, where the public just hunkers down and patiently or impassively bides its time through the supporting programme, and where there is nothing in front of you but images on a screen, you laugh when, <em>and only when</em>, the film strikes you as genuinely funny. Is it not so?</p></blockquote>
<p>If you are an ardent theatre fan, then don&#8217;t let that excerpt put you off, as there is plenty else in the book. One thing you should know, however, is that Gilbert Adair was an ardent cinema fan. He was also a Francophile,  spending several years of his life in Paris and frequenting the cinemas there. (One of my biggest regrets about his death is that he never got to meet my Parisian wife, who has also spent many hours in Parisian cinemas.)</p>
<p>Another of his books, <a href="http://www.amazon.co.uk/Holy-Innocents-Gilbert-Adair/dp/0749390093">The Holy Innocents</a>, opens with a description of <a href="http://en.wikipedia.org/wiki/Cinémathèque_Française">La Cinémathèque Française</a> as it was in 1968. Even if you haven&#8217;t heard of Gilbert Adair, you may well have heard of the film, Bernardo Bertolucci&#8217;s <a href="http://en.wikipedia.org/wiki/The_Dreamers_(film)">The Dreamers</a>, that this book was turned into. I first heard about that in an email message in 2002, but he told me more when we had lunch again in the summer of 2003 &#8212; which, I can hardly believe now, was the last time I saw him in person. He said that he hated the novel (which was his first) and that if he ever came across a copy in a second-hand bookshop he would buy it to make sure it didn&#8217;t get read. I may have got the details slightly wrong, but he had a funny conversation with his agent that went something like this. </p>
<p>&#8220;Gilbert, there&#8217;s someone who wants to film The Holy Innocents.&#8221; </p>
<p>&#8220;But you know perfectly well I would never allow that to be filmed.&#8221; </p>
<p>&#8220;It&#8217;s Bernardo Bertolucci.&#8221;</p>
<p>He wrote the screenplay for the film and was closely involved with the filming itself. He also rewrote the book: it now exists as The Dreamers. I&#8217;m glad to say that there was one second-hand bookshop that I got to before he did, so I&#8217;ve got a copy of The Holy Innocents, and I don&#8217;t understand what he fails to see in it. His reaction to hearing that I had found it: &#8220;I hate to hear (as I do more often than you would imagine) of friends coming across copies of my hated first novel in quaint second-hand bookshops, but I accept there&#8217;s nothing I can do about it. Oh well.&#8221; It is a reworking of Jean Cocteau&#8217;s <a href="http://en.wikipedia.org/wiki/Les_Enfants_Terribles">Les Enfants Terribles</a>. Here&#8217;s what Anthony Burgess had to say about it (this comes from the back cover).</p>
<blockquote><p>Manifestly in the tradition of Jean Cocteau&#8217;s <em>Les Enfants Terribles</em>, considered a masterpiece, this is a far better book.</p></blockquote>
<p>So far, I&#8217;ve mentioned a translation of La Disparition and reworkings of Lewis Carroll and Jean Cocteau. Maybe now is the time to say that almost all his books were based on existing works in one way or another. Here is a footnote from &#8220;On transtextuality&#8221;, one of the essays in Surfing the Zeitgeist.</p>
<blockquote><p>I should perhaps declare an interest here. As it happens, most of my own published fiction is (neatly rhyming with &#8220;incestuous&#8221;) &#8220;palimpsestuous&#8221; in inspiration; and, in even the most laudatory reviews I have received, there has been detectable an implicit reproach that I have yet to embark on what might be called a &#8220;solo flight&#8221;. &#8220;When, oh, when,&#8221; is what I keep reading, &#8220;will Adair become his own man . . .?&#8221;</p></blockquote>
<p>Another book that I acquired in Princeton (but this one I ordered from Amazon) was <a href="http://www.amazon.co.uk/Love-Death-Island-Gilbert-Adair/dp/0749336366">Love and Death on Long Island</a>, which is inspired by Death in Venice. In it, an elderly man accidentally goes to see the wrong film in a cinema and becomes obsessed by the young male star of that film, tracking him down to where he lives in Long Island. The book was made into a critically acclaimed film of the same name, but I can&#8217;t bear to see the film because the book has a kind of perfection about it that I don&#8217;t want to spoil.</p>
<p>At around this time he consulted me about a mathematical subplot that he wanted to incorporate into a novel. He was looking for an example of a plausible contemporary mathematical controversy. It was a difficult challenge, since there is such a widely shared notion of what constitutes an acceptable proof that true controversy in mathematics is very rare. One idea I had was that perhaps somebody could prove by indirect means that a proof of some famous conjecture existed but be unable to provide that proof. I wasn&#8217;t sure that that was possible &#8212; wouldn&#8217;t the proof that a proof existed somehow constitute a proof? &#8212; but it did feel as though it might be, and that if it was then it would be the kind of situation that could lead to quite a bit of discussion. Another thought I had was that perhaps somebody could solve a famous problem but use large cardinals in an essential way. In the end, however, he gave up on the idea, saying, &#8220;I found it simply impossible to reconcile the comprehension of &#8216;the non-professional reader&#8217; with my own aversion to gross over-simplification.&#8221;</p>
<p>Going back to our lunch later that year, which was in a street parallel to and not far from the Tottenham Court Road (although I invited him this time, I had no idea where a suitable place might be, so rather ineptly I got him to suggest somewhere, which turned out to be closed, so we went to another place nearby and sat outside), he told me a story that I find so extraordinary that I&#8217;ve never forgotten it. For reasons he didn&#8217;t go into, he was estranged from his family. The last time he had seen any of them was when he was living in Paris. He was walking down the Boulevard Saint Michel towards the Boulevard Saint Germain when he saw, walking towards him, his brother, whom he had not seen for many years, and the two of them simply pretended not to know each other.</p>
<p>There was a lot else that he alluded to but never talked about in detail. Over the years since 2003 a pattern developed, or rather continued, where we would have a brief flurry of emails, usually prompted by his wanting to explore another mathematical idea, followed by a couple of years of silence. Each time the silence was broken, he would say things like, &#8220;I&#8217;m glad to hear things have been going well for you. I would be lying if I said they had for me.&#8221; But he wouldn&#8217;t say what he meant by that, except that at least some of it was a matter of poor health.</p>
<p>Also at the lunch in 2003 he gave me a copy of his (I think latest) book <a href="http://www.amazon.com/Real-Tadzio-Thomas-Venice-Inspired/dp/0786712473">The Real Tadzio</a>, about the boy who inspired Thomas Mann&#8217;s Death in Venice, later of course to become Visconti&#8217;s famous film of the same name. I&#8217;m not going to say much about the book except that it is short and I finished it on the same day, as a result of which I remember very little about it. (One thing I&#8217;m realizing as I write this is that I can enjoy not just the books of Gilbert Adair that I have not yet read, but also, for a second time, several of the ones I have read.) Glancing at it quickly, I see that it is not just about the boy, who was called Wladyslaw Moes, but about Thomas Mann, Visconti, the film, Bjorn Andresen (the actor who played the boy in the film), Benjamin Britten, and much else besides.</p>
<p>One of Gilbert Adair&#8217;s last projects was a pastiche of Agatha Christie, which, slightly to his surprise since he didn&#8217;t like repeating himself, turned into a trilogy. The first book was called The Act of Roger Murgatroyd, which, had I been more of an Agatha Christie fan, I would have recognised as a reference to her book The Murder of Roger Ackroyd. The other two are <a href="http://www.amazon.co.uk/Mysterious-Affair-Style-Sequel-Trilogy/dp/0571234259/ref=pd_sim_b_1">A Mysterious Affair of Style</a>, and <a href="http://www.amazon.co.uk/Then-There-Evadne-Mount-Mystery/dp/0571238815/ref=pd_sim_b_2">And Then There Was No One</a>. In these books he has about as much fun with the genre as it is possible to have.</p>
<p>As for our correspondence, the last exchange we had was in 2009. As ever, it started with his wanting to know what I thought about a piece of mathematics he had come up with: in this case the observation that the sums of certain arithmetic progressions were always composite &#8212; not too surprising when you think about the formula for the sum of an AP, and also how that formula is derived. He understood perfectly well that what he had noticed was likely to be either easy or not always true, so took it perfectly well when I told him it was the former. </p>
<p>Long-term readers of this blog may remember a <a href="http://gowers.wordpress.com/2009/06/05/swine-flu-and-british-public-health-policy/">post about swine flu</a> from a couple of years ago. When I wrote my first reply, I was more or less confined to the house because a son of mine had had it (and gained fifteen minutes of fame for shutting down Eton), so I mentioned that. I also mentioned that I had remarried and had another son, Octave, then 18 months. His response contained a rather Adairish surprise:</p>
<blockquote><p>Thank you for your prompt and really rather unusually interesting email. Only the other day, for example, I was reading a newspaper article about swine flu at Eton and suddenly here it is in my own correspondence. Also, I have been working on a film script (treatment, rather, for the moment) set in Scotland in the nineteenth century but featuring a Frenchman as the protagonist’s best friend and confidant. What name did I give him? Octave! Well, I never, as my mother never tired of saying.</p></blockquote>
<p>I never found out anything further about the film treatment, so I don&#8217;t know whether it ever came to fruition.</p>
<p>I&#8217;ll close by drawing attention to <a href="http://www.guardian.co.uk/commentisfree/2011/dec/11/henry-porter-gilbert-adair-on-human-kindness">an article in the Guardian</a> about his treatment by the National Health Service after his stroke. Apparently he was extremely impressed by it and wanted that fact to be publicized.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/12/18/farewell-to-a-pen-friend/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Group actions IV: intrinsic actions</title>
		<link>http://gowers.wordpress.com/2011/12/10/group-actions-iv-intrinsic-actions/</link>
		<comments>http://gowers.wordpress.com/2011/12/10/group-actions-iv-intrinsic-actions/#comments</comments>
		<pubDate>Sat, 10 Dec 2011 17:42:16 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3848</guid>
		<description><![CDATA[I have a confession to make. When I was an undergraduate at Cambridge (hmm, that sounds as though it might be the beginning of quite an interesting confession, so I&#8217;d better forestall any disappointment by saying right now that it isn&#8217;t), there was a third-year course in group theory, taught by John Thompson no less, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3848&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I have a confession to make. When I was an undergraduate at Cambridge (hmm, that sounds as though it might be the beginning of quite an interesting confession, so I&#8217;d better forestall any disappointment by saying right now that it isn&#8217;t), there was a third-year course in group theory, taught by <a href="http://en.wikipedia.org/wiki/John_G._Thompson">John Thompson</a> no less, on which I did not do very well. For a few weeks it seemed to cover material that we&#8217;d done in our first year, and then suddenly it got serious, with things like the Sylow theorems. And at that point I got lost, and was unable to do the questions on the examples sheets. I can&#8217;t remember much about the questions, but I think my difficulty was that there was a slightly indirect style of proof that caused me to find arguments hard to remember and even harder to come up with. And I never got round to doing anything about it: I went into a different area of maths, and even now I don&#8217;t know the proofs of the Sylow theorems. In fact, I don&#8217;t even know the statements, though I know they&#8217;re about the existence of subgroups of various cardinalities, and I know that they are proved using cleverly defined group actions. I&#8217;ve skim-read the proofs, so I have a fairly good idea of their flavour, but I don&#8217;t know the details. In particular, I don&#8217;t know <em>which</em> action does the job.<br />
<span id="more-3848"></span></p>
<p>This post is an experiment. I&#8217;m going to try to come up with a group action that will tell me about the existence of a subgroup of some given cardinality, and see whether I end up formulating and proving one of the Sylow theorems. And I&#8217;ll try to record all my thoughts as I do so. The aim of this is not to give you a nice slick proof of a Sylow theorem &#8212; far from it &#8212; but to give an idea of how one might search for a group action that achieves some specific purpose.</p>
<p>Perhaps I ought to say more about the Sylow theorems. They are sometimes presented as a kind of converse to Lagrange&#8217;s theorem. Lagrange&#8217;s theorem tells us that every subgroup of a group has order that divides the order of the whole group. But if we&#8217;re given a factor of the order of the group, can we find a subgroup with that order? At least one of Sylow&#8217;s theorems tells us that for certain factors we can.</p>
<p>Since that is a (not yet fully worked out) statement about <em>general</em> groups rather than groups that are defined in a particular way, we are not given any external objects on which the group acts. So we can&#8217;t define an action in terms of some action that is already around. It would appear therefore that our only choice is to come up with an <em>intrinsic</em> action &#8212; that is, one that is defined in terms of the group itself. To put that another way, an intrinsic group action is one that you could define for a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> about which you know absolutely nothing at all. (Well, that&#8217;s not quite true &#8212; I&#8217;m going to assume that it&#8217;s finite.)</p>
<p>Before we go any further, let&#8217;s list a few intrinsic group actions. To begin with, let&#8217;s think about what actions there are of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on itself. That is, let&#8217;s consider actions where the set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> on which <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts is the set of elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. A very obvious action is given by left multiplication. That is, if <img src='http://s0.wp.com/latex.php?latex=a%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in G' title='a&#92;in G' class='latex' />, we define a function <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a%3AG%5Cto+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a:G&#92;to G' title='&#92;phi_a:G&#92;to G' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a%28g%29%3Dag&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a(g)=ag' title='&#92;phi_a(g)=ag' class='latex' />. (If you prefer the product notation you could let <img src='http://s0.wp.com/latex.php?latex=X%3DG&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X=G' title='X=G' class='latex' /> and define the action, which is now a function from <img src='http://s0.wp.com/latex.php?latex=G%5Ctimes+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G&#92;times X' title='G&#92;times X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, by <img src='http://s0.wp.com/latex.php?latex=%28a%2Cg%29%5Cmapsto+ag&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,g)&#92;mapsto ag' title='(a,g)&#92;mapsto ag' class='latex' />.)</p>
<p>It doesn&#8217;t take much to come up with right multiplication as another action. That is, we can define <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a(g)' title='&#92;psi_a(g)' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=ga&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ga' title='ga' class='latex' />. But is that an action? Let&#8217;s check. What is <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_%7Bab%7D%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_{ab}(g)' title='&#92;psi_{ab}(g)' class='latex' />? It&#8217;s <img src='http://s0.wp.com/latex.php?latex=gab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gab' title='gab' class='latex' />. And what is <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a%5Cpsi_b%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a&#92;psi_b(g)' title='&#92;psi_a&#92;psi_b(g)' class='latex' />? It&#8217;s <img src='http://s0.wp.com/latex.php?latex=gba&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gba' title='gba' class='latex' />. Oops. </p>
<p>What we would really like is to insert that <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> after the <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> and before the <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />. And there is in fact a way of doing that &#8230; sort of. We define <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a(g)' title='&#92;psi_a(g)' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=ga%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ga^{-1}' title='ga^{-1}' class='latex' /> instead. Let&#8217;s do the check again. Now <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_%7Bab%7D%28g%29%3Dg%28ab%29%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_{ab}(g)=g(ab)^{-1}' title='&#92;psi_{ab}(g)=g(ab)^{-1}' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a%5Cpsi_b%28g%29%3Dgb%5E%7B-1%7Da%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a&#92;psi_b(g)=gb^{-1}a^{-1}' title='&#92;psi_a&#92;psi_b(g)=gb^{-1}a^{-1}' class='latex' />. Since the inverse of a product is the product of the inverses in the reverse order, these two expressions agree, so this time it checks out.</p>
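<p>If you would like to see this check carried out mechanically, here is a minimal Python sketch. It assumes we model the symmetric group on three points as tuples (a choice made purely for illustration), and the helper names <code>mul</code>, <code>inv</code>, <code>is_action</code>, <code>right</code> and <code>right_inv</code> are invented for this example. It simply tests the composition rule for every triple of elements.</p>

```python
from itertools import permutations

# Elements of S_3 as tuples: p[i] is the image of i under p.
G = list(permutations(range(3)))


def mul(p, q):
    """Product pq, meaning 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    """Inverse permutation of p."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)


def is_action(psi):
    """Check psi(ab, g) == psi(a, psi(b, g)) for all a, b, g in G."""
    return all(psi(mul(a, b), g) == psi(a, psi(b, g))
               for a in G for b in G for g in G)


def right(a, g):
    """Naive right multiplication: g goes to ga."""
    return mul(g, a)


def right_inv(a, g):
    """Right multiplication by the inverse: g goes to g a^{-1}."""
    return mul(g, inv(a))


print(is_action(right))      # False, because S_3 is not abelian
print(is_action(right_inv))  # True
```

<p>In an abelian group the first check would pass as well, which is why a non-abelian example is needed to see the problem.</p>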
<p>And that neatly brings us to a third action, which is where we do the first two at the same time. That is, we define <img src='http://s0.wp.com/latex.php?latex=%5Crho_a%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho_a(g)' title='&#92;rho_a(g)' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=aga%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aga^{-1}' title='aga^{-1}' class='latex' />. This is the <em>conjugation action</em> of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on itself, so called for the obvious reason that <img src='http://s0.wp.com/latex.php?latex=%5Crho_a%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho_a(g)' title='&#92;rho_a(g)' class='latex' /> is what you get when you conjugate <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />. (For weeks now I&#8217;ve been meaning to write a post on conjugation. I&#8217;ll try to get round to it at some point.)</p>
<p>The first two actions aren&#8217;t interestingly different, because <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a' title='&#92;phi_a' class='latex' /> does to <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> what <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a' title='&#92;psi_a' class='latex' /> does to <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}' title='g^{-1}' class='latex' />. To put that more precisely, let <img src='http://s0.wp.com/latex.php?latex=%5Cbeta%3AX%5Cto+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta:X&#92;to X' title='&#92;beta:X&#92;to X' class='latex' /> be the map that takes each element to its inverse. Then <img src='http://s0.wp.com/latex.php?latex=%5Cbeta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta' title='&#92;beta' class='latex' /> is a bijection, and <img src='http://s0.wp.com/latex.php?latex=%5Cpsi_a%28g%29%3D%5Cbeta%5E%7B-1%7D%5Cphi_a%28%5Cbeta+g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi_a(g)=&#92;beta^{-1}&#92;phi_a(&#92;beta g)' title='&#92;psi_a(g)=&#92;beta^{-1}&#92;phi_a(&#92;beta g)' class='latex' />. (Also, as it happens, <img src='http://s0.wp.com/latex.php?latex=%5Cbeta%5E%7B-1%7D%3D%5Cbeta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta^{-1}=&#92;beta' title='&#92;beta^{-1}=&#92;beta' class='latex' />, but that is not the main point here.)</p>
<p>However, the first action is completely different from the third. For example, for the left-multiplication action, there is precisely one orbit, whereas the orbits of the conjugation action are conjugacy classes. (If this isn&#8217;t obvious to you, that can only be because you haven&#8217;t properly internalized the definitions. I recommend sitting down to prove it, and after a few seconds you&#8217;ll see that when you write down what the orbit of a point is, you have written out the definition of the conjugacy class of that point.)</p>
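<p>For a concrete picture of the difference, the following Python sketch computes the orbits of both actions in one small case. The group (the symmetric group on three points, stored as tuples) and the helper names <code>mul</code>, <code>inv</code> and <code>orbits</code> are assumptions made for this illustration only.</p>

```python
from itertools import permutations

# S_3 as tuples: p[i] is the image of i under p.
G = list(permutations(range(3)))


def mul(p, q):
    """Product pq, meaning 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    """Inverse permutation of p."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)


def orbits(act):
    """Set of orbits of the action sending (a, g) to act(a, g)."""
    return {frozenset(act(a, g) for a in G) for g in G}


left_orbits = orbits(lambda a, g: mul(a, g))
conj_orbits = orbits(lambda a, g: mul(mul(a, g), inv(a)))

print(len(left_orbits))                     # 1
print(sorted(len(o) for o in conj_orbits))  # [1, 2, 3]
```

<p>The three conjugation orbits here are the identity, the two 3-cycles and the three transpositions, which are exactly the conjugacy classes of this group.</p>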
<p>The left-multiplication action just defined doesn&#8217;t feel as though it is going to be all that helpful, because in some sense it just <em>is</em> the group. In fact, if we write it in the product way, then it takes <img src='http://s0.wp.com/latex.php?latex=%28a%2Cg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,g)' title='(a,g)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=ag&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ag' title='ag' class='latex' />, so it really is nothing other than the group operation, so any information we can glean from it, we can glean from the group itself. And it&#8217;s also hard to see what we can learn from the conjugation action that we couldn&#8217;t learn just by talking directly about conjugacy classes.</p>
<p>However, all that changes if we change the set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> on which these operations act. Instead of taking <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, we can build something else out of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. Here are a few examples of actions derived from left multiplication in an obvious way.</p>
<p>1. If we happen to know that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> has a subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, then we can take the obvious left-multiplication action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the set of all left cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. (This isn&#8217;t really an intrinsic action in the sense I mean, because <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is an &#8220;external&#8221; object.)</p>
<p>2. We could define an action on the set of all ordered pairs <img src='http://s0.wp.com/latex.php?latex=%28g%2Ch%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(g,h)' title='(g,h)' class='latex' /> of elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a%28g%2Ch%29%3D%28ag%2Cah%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a(g,h)=(ag,ah)' title='&#92;phi_a(g,h)=(ag,ah)' class='latex' />. </p>
<p>3. We could define an action on the set of all subsets of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> by taking <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a%28E%29%3DaE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a(E)=aE' title='&#92;phi_a(E)=aE' class='latex' />. Here <img src='http://s0.wp.com/latex.php?latex=aE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aE' title='aE' class='latex' /> is defined to be <img src='http://s0.wp.com/latex.php?latex=%5C%7Bag%3Ag%5Cin+E%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ag:g&#92;in E&#92;}' title='&#92;{ag:g&#92;in E&#92;}' class='latex' />. </p>
<p>4. We could define an action on the set of all partitions of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> into sets <img src='http://s0.wp.com/latex.php?latex=E_1%2C%5Cdots%2CE_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E_1,&#92;dots,E_r' title='E_1,&#92;dots,E_r' class='latex' /> of sizes <img src='http://s0.wp.com/latex.php?latex=k_1%2C%5Cdots%2Ck_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k_1,&#92;dots,k_r' title='k_1,&#92;dots,k_r' class='latex' /> by taking <img src='http://s0.wp.com/latex.php?latex=%5Cphi_a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_a' title='&#92;phi_a' class='latex' /> of the partition <img src='http://s0.wp.com/latex.php?latex=%5C%7BE_1%2C%5Cdots%2CE_r%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{E_1,&#92;dots,E_r&#92;}' title='&#92;{E_1,&#92;dots,E_r&#92;}' class='latex' /> to be the partition <img src='http://s0.wp.com/latex.php?latex=%5C%7BaE_1%2C%5Cdots%2CaE_r%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{aE_1,&#92;dots,aE_r&#92;}' title='&#92;{aE_1,&#92;dots,aE_r&#92;}' class='latex' />. </p>
<p>One thing I <em>can</em> remember is the proof of Cauchy&#8217;s theorem: that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is a prime that divides the order of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> has a subgroup of order <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. The action there is a bit different, since it&#8217;s an action not of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> but of the cyclic group <img src='http://s0.wp.com/latex.php?latex=C_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_p' title='C_p' class='latex' /> on a set that is derived from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. To be precise, it acts on the set of all <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />-tuples <img src='http://s0.wp.com/latex.php?latex=%28a_0%2Ca_1%2C%5Cdots%2Ca_%7Bp-1%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a_0,a_1,&#92;dots,a_{p-1})' title='(a_0,a_1,&#92;dots,a_{p-1})' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a_0a_1%5Cdots+a_%7Bp-1%7D%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_0a_1&#92;dots a_{p-1}=e' title='a_0a_1&#92;dots a_{p-1}=e' class='latex' />, and it acts by cyclically permuting the entries of each <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />-tuple. 
So I&#8217;ll have to bear in mind that we may be looking for that kind of action rather than an action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. However, for now let&#8217;s press on.</p>
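<p>As an aside, the cyclic action in the proof of Cauchy&#8217;s theorem is easy to experiment with. The Python sketch below builds the set of tuples for one assumed example (the symmetric group on three points with the prime 3) and counts the tuples fixed by the cyclic shift. The names <code>prod</code>, <code>shift</code>, <code>X</code> and <code>fixed</code> are invented for this illustration, and the closing remark uses the usual counting argument, which is not spelled out above.</p>

```python
from itertools import permutations, product

p = 3
G = list(permutations(range(3)))  # S_3, whose order 6 is divisible by p
e = tuple(range(3))               # identity permutation


def mul(a, b):
    """Product ab, meaning 'apply b first, then a'."""
    return tuple(a[b[i]] for i in range(3))


def prod(t):
    """Product of the entries of the tuple t, taken in order."""
    result = e
    for a in t:
        result = mul(result, a)
    return result


# X is the set of all p-tuples whose entries multiply to the identity.
X = [t for t in product(G, repeat=p) if prod(t) == e]


def shift(t):
    """Generator of C_p: move the first entry of t to the end."""
    return t[1:] + t[:1]


shift_preserves_X = all(prod(shift(t)) == e for t in X)
fixed = [t for t in X if shift(t) == t]

print(len(X))             # 36
print(shift_preserves_X)  # True
print(len(fixed))         # 3
```

<p>A fixed tuple has all its entries equal to some element whose third power is the identity. Every orbit has size 1 or 3 and the set has 36 elements, so the number of fixed tuples is a multiple of 3. One of them is the all-identity tuple, and the other two come from elements of order 3.</p>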
<p>How am I going to be able to use a left-multiplication action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to tell me anything about the existence of a subgroup of some cardinality? Let&#8217;s take the cardinality to be <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />.</p>
<p>The one that looks most promising to start with is the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the set of all subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, for the simple reason that a subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is in particular a subset of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. Of course, a subgroup isn&#8217;t just any old subset of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, but one of a very particular kind. If we want to make progress, we&#8217;d ideally like to say what is special about subgroups using concepts like orbits and stabilizers. So let&#8217;s begin by thinking about what the orbit of a subgroup looks like when we act on the left.</p>
<p>If <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is a subgroup, then every set <img src='http://s0.wp.com/latex.php?latex=a+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a H' title='a H' class='latex' /> is a left coset of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, and one thing we know about left cosets of subgroups is that they partition the group. In terms of orbits, that tells us that the orbit of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> under the left-multiplication action has cardinality <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2F%7CH%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/|H|' title='|G|/|H|' class='latex' />. We also know that the stabilizer of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> itself, which of course has cardinality <img src='http://s0.wp.com/latex.php?latex=%7CH%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|H|' title='|H|' class='latex' />. (Incidentally, the orbit-stabilizer theorem implies Lagrange&#8217;s theorem, as these observations show.)</p>
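<p>These two facts can be checked directly in a small case. The following Python sketch assumes the symmetric group on three points and a subgroup of order 2 generated by one transposition. The names <code>translate</code>, <code>orbit</code> and <code>stabilizer</code> are made up for the purpose of the example.</p>

```python
from itertools import permutations

# S_3 as tuples: p[i] is the image of i under p.
G = list(permutations(range(3)))


def mul(p, q):
    """Product pq, meaning 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(3))


# Subgroup of order 2: the identity and the transposition swapping 0 and 1.
H = frozenset({(0, 1, 2), (1, 0, 2)})


def translate(a, E):
    """The set aE obtained by multiplying every element of E on the left by a."""
    return frozenset(mul(a, g) for g in E)


orbit = {translate(a, H) for a in G}                 # the left cosets of H
stabilizer = {a for a in G if translate(a, H) == H}

print(len(orbit))                              # 3
print(stabilizer == set(H))                    # True
print(len(orbit) * len(stabilizer) == len(G))  # True
```

<p>The last line is the orbit-stabilizer theorem in this instance: three cosets times a stabilizer of size two gives the six elements of the group.</p>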
<p>But do these facts distinguish subgroups from other subsets? Might there be a subset <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> that is not a subgroup, and yet its orbit has size <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2Fk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/k' title='|G|/k' class='latex' />?</p>
<p>Oops, that was the wrong question, since <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> could be a left coset of a subgroup. So let&#8217;s ask whether <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> has to be a left coset of a subgroup. </p>
<p>It feels very much as though we should use the orbit-stabilizer theorem here. Why? Because to show that something has size <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2Fk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/k' title='|G|/k' class='latex' /> looks tricky, whereas to show that something has size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, when the object we start with already has size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, looks potentially a lot easier.</p>
<p>Because I prefer thinking about subgroups to thinking about left cosets, I&#8217;m going to make the additional assumption that <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> contains the identity (so that if it is a left coset then it has to be a subgroup). It should be possible to &#8220;translate the whole discussion&#8221; later.</p>
<p>If <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a set of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> that contains the identity, then what can we say about the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />? Well, for <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> to belong to the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />, we need <img src='http://s0.wp.com/latex.php?latex=aE%3DE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aE=E' title='aE=E' class='latex' />, and in particular, we need <img src='http://s0.wp.com/latex.php?latex=ae%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ae=a' title='ae=a' class='latex' /> to belong to <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />. Therefore, the only possible elements of the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> are elements of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />. But do they belong to the stabilizer? Yes, if and only if <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is closed under left multiplication, which is true if and only if <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a subgroup.</p>
<p>How do we &#8220;translate that discussion&#8221;? Let&#8217;s just fix any old element of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> and call it <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />. Now we can say with complete confidence that if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> belongs to the stabilizer, then <img src='http://s0.wp.com/latex.php?latex=ag%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ag&#92;in E' title='ag&#92;in E' class='latex' />, from which it follows that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+Eg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in Eg^{-1}' title='a&#92;in Eg^{-1}' class='latex' />. For the stabilizer to have size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, we therefore need it to equal <img src='http://s0.wp.com/latex.php?latex=Eg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Eg^{-1}' title='Eg^{-1}' class='latex' />, so we need the following to be true: if <img src='http://s0.wp.com/latex.php?latex=h%2Ch%27%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h,h&#039;&#92;in E' title='h,h&#039;&#92;in E' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=hg%5E%7B-1%7Dh%27%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg^{-1}h&#039;&#92;in E' title='hg^{-1}h&#039;&#92;in E' class='latex' />. </p>
<p>We&#8217;re hoping to show that this implies that <img src='http://s0.wp.com/latex.php?latex=E%3DgH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E=gH' title='E=gH' class='latex' /> for some subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, so let&#8217;s see what this condition tells us about <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7DE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}E' title='g^{-1}E' class='latex' />. If <img src='http://s0.wp.com/latex.php?latex=h%2Ch%27%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h,h&#039;&#92;in E' title='h,h&#039;&#92;in E' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7Dh%2Cg%5E%7B-1%7Dh%27%5Cin+g%5E%7B-1%7DE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}h,g^{-1}h&#039;&#92;in g^{-1}E' title='g^{-1}h,g^{-1}h&#039;&#92;in g^{-1}E' class='latex' />, and the condition above tells us that <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7Dhg%5E%7B-1%7Dh%27%5Cin+g%5E%7B-1%7DE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}hg^{-1}h&#039;&#92;in g^{-1}E' title='g^{-1}hg^{-1}h&#039;&#92;in g^{-1}E' class='latex' />. So <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7DE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}E' title='g^{-1}E' class='latex' /> is closed under multiplication, just as we wanted.</p>
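<p>The conclusion of the last few paragraphs, that a set of size k has a stabilizer of size k exactly when it is a left coset of a subgroup, can be checked by brute force in a small non-abelian group. The following is only a sketch under assumptions: the group is again the permutations of three points, and the names <code>is_subgroup</code> and <code>check</code> are hypothetical helpers introduced for the example.</p>

```python
from itertools import combinations, permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: (ab)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)

def translate(a, E):
    """The left translate aE = {ax : x in E}."""
    return frozenset(compose(a, x) for x in E)

G = list(permutations(range(3)))
identity = (0, 1, 2)

def is_subgroup(S):
    # A finite set containing the identity and closed under products is a subgroup.
    return identity in S and all(compose(x, y) in S for x in S for y in S)

def check(k):
    """Compare 'stabilizer has size k' with 'is a left coset of a subgroup'."""
    subsets = [frozenset(c) for c in combinations(G, k)]
    cosets = {translate(g, H) for H in subsets if is_subgroup(H) for g in G}
    big_stabilizer = {E for E in subsets
                      if sum(translate(a, E) == E for a in G) == k}
    return big_stabilizer == cosets

print(check(2), check(3))
```

<p>Both checks agree in this group, which is of course no substitute for the argument above, but it is a reassuring sanity check.</p>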
<p>All this suggests a proof strategy. We could consider the left-multiplication action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the set of all subsets of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, and try to show that at least one orbit has size <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2Fk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/k' title='|G|/k' class='latex' />, under appropriate conditions on <img src='http://s0.wp.com/latex.php?latex=%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|' title='|G|' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, whatever they might turn out to be. </p>
<p>But how are we going to do that? A pretty obvious starting point is that we&#8217;ll want <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> to divide <img src='http://s0.wp.com/latex.php?latex=%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|' title='|G|' class='latex' />, so let&#8217;s assume that without further comment.</p>
<p>What else can we say? We know that the orbits form a partition of the collection of subsets of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, so it might help to know how many of those there are. Indeed, it definitely would, because that would open up the possibility of using the orbit-stabilizer theorem.</p>
<p>Can we say in advance how the orbit-stabilizer theorem might conceivably come in useful? Well, one thing we know is that the stabilizer of a subset is going to be a subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, so its size will be a factor of <img src='http://s0.wp.com/latex.php?latex=%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|' title='|G|' class='latex' />. This puts a restriction on the possible sizes of the stabilizers, and also, by the orbit-stabilizer theorem, on the possible sizes of the orbits. So perhaps we can argue somehow that in order to get those sizes to add up to the total number of subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> we&#8217;ve got to have at least some orbits of size <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2Fk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/k' title='|G|/k' class='latex' />.</p>
<p>Of course, the number of sets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> is just <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+%7B%7CG%7C%7Dk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom {|G|}k' title='&#92;binom {|G|}k' class='latex' />, so the real question is whether we can say anything interesting about that number.</p>
<p>Here, I have to admit that I&#8217;m drawing on my mathematical experience, and possibly also on dim memories of what the proofs looked like, which result in my strongly expecting that divisibility properties are going to be relevant. You&#8217;ll know from Numbers and Sets, for example, that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is a prime and <img src='http://s0.wp.com/latex.php?latex=1%5Cleq+a%5Cleq+p-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1&#92;leq a&#92;leq p-1' title='1&#92;leq a&#92;leq p-1' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+pa&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom pa' title='&#92;binom pa' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. </p>
<p><strong>Edit:</strong> <em>The next four paragraphs are not very clear and contain at least one false statement. I have written a better account in <a href="http://gowers.wordpress.com/2011/12/10/group-actions-iv-intrinsic-actions/#comment-14341">this comment</a>. Thanks to Joseph for drawing my attention to this, and apologies to anyone else who has struggled with the original version.</em></p>
<p>That fact can be considerably generalized, so let me pause to discuss it for a while. My favourite proof of the fact itself is one that can be expressed in terms of group actions if you want. Consider the action of the cyclic group <img src='http://s0.wp.com/latex.php?latex=C_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_p' title='C_p' class='latex' /> on all its subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. For convenience, let&#8217;s think of <img src='http://s0.wp.com/latex.php?latex=C_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_p' title='C_p' class='latex' /> as the group <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> of integers mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> under addition. So if <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a subset of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cphi_aE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi_aE' title='&#92;phi_aE' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=a%2BE&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+E' title='a+E' class='latex' />. I claim that the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is just <img src='http://s0.wp.com/latex.php?latex=0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0' title='0' class='latex' />. 
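Indeed, if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is any non-zero element of the stabilizer and <img src='http://s0.wp.com/latex.php?latex=x%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in E' title='x&#92;in E' class='latex' />, then all of <img src='http://s0.wp.com/latex.php?latex=x%2C+x%2Ba%2C+x%2B2a%2C+%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x, x+a, x+2a, &#92;dots' title='x, x+a, x+2a, &#92;dots' class='latex' /> have to belong to <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />, which, because <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is prime, tells us that <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is the whole group. But we&#8217;re assuming that <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is non-empty and not equal to the whole group. So every stabilizer is trivial, every orbit has size <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> by the orbit-stabilizer theorem, and the number of subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, being a sum of orbit sizes, is a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</p>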
<p>What happens if we look at an arbitrary pair of numbers <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> and try to run the same argument? That is, we consider the left action of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> on its subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> be one of those subsets, and, as before, let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be an element of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />. Then, again as before, if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is in the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />, the elements <img src='http://s0.wp.com/latex.php?latex=x%2C+x%2Ba%2C+x%2B2a%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x, x+a, x+2a,&#92;dots' title='x, x+a, x+2a,&#92;dots' class='latex' /> must all belong to <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />. 
But if <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> isn&#8217;t prime, we can no longer deduce that the numbers <img src='http://s0.wp.com/latex.php?latex=x%2C+x%2Ba%2C+x%2B2a%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x, x+a, x+2a,&#92;dots' title='x, x+a, x+2a,&#92;dots' class='latex' /> run through the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' />. </p>
<p>However, we can say exactly what they <em>do</em> run through &#8212; this is a bit of Numbers and Sets. If <img src='http://s0.wp.com/latex.php?latex=d%3D%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d=(a,n)' title='d=(a,n)' class='latex' />, then they run through all elements of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> that differ from <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> by a multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />, of which there are <img src='http://s0.wp.com/latex.php?latex=n%2Fd&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/d' title='n/d' class='latex' />. That is, we take the subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> generated by <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> and then we know that if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is in the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> contains an element <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, then the entire coset <img src='http://s0.wp.com/latex.php?latex=x%2BH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+H' title='x+H' 
class='latex' /> lies in <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> as well. So <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> must be a union of cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />.</p>
<p>This isn&#8217;t possible unless <img src='http://s0.wp.com/latex.php?latex=%7CE%7C%3Dk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|E|=k' title='|E|=k' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2Fd&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/d' title='n/d' class='latex' />. That is, <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> can&#8217;t be in the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> unless <img src='http://s0.wp.com/latex.php?latex=n%2F%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/(a,n)' title='n/(a,n)' class='latex' /> is a factor of <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, or equivalently <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a factor of <img src='http://s0.wp.com/latex.php?latex=k%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k(a,n)' title='k(a,n)' class='latex' />, or equivalently again <img src='http://s0.wp.com/latex.php?latex=%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,n)' title='(a,n)' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2F%28n%2Ck%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/(n,k)' title='n/(n,k)' class='latex' />. 
In particular, if <img src='http://s0.wp.com/latex.php?latex=%28n%2Ck%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n,k)=1' title='(n,k)=1' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,n)' title='(a,n)' class='latex' /> has to be a multiple of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, which implies that <img src='http://s0.wp.com/latex.php?latex=a%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;equiv 0' title='a&#92;equiv 0' class='latex' />, so in this case the stabilizer of every set has size <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and we can deduce, just as before, that <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+nk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom nk' title='&#92;binom nk' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. More generally, if <img src='http://s0.wp.com/latex.php?latex=%28n%2Ck%29%3Dm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n,k)=m' title='(n,k)=m' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%28a%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,n)' title='(a,n)' class='latex' /> has to be a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' />. 
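Since <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' /> is a factor of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, this happens if and only if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> itself is a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' />. (Note that this is a necessary condition for <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> to belong to the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />, not a sufficient one.)</p>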
<p>There are <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> multiples of <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' />, which form a subgroup, so the stabilizer is a subgroup of a group of size <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. Therefore, the orbit has size equal to a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' />. It follows that the union of the orbits has size equal to a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2Fm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/m' title='n/m' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+nk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom nk' title='&#92;binom nk' class='latex' /> is always a multiple of <img src='http://s0.wp.com/latex.php?latex=n%2F%28n%2Ck%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/(n,k)' title='n/(n,k)' class='latex' />. (It&#8217;s possible to give more or less the same argument but without using the language of orbits and stabilizers: you just define two sets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> to be <em>cyclically equivalent</em> if one is a translate of the other, and you then argue that the equivalence classes must have sizes that are multiples of <img src='http://s0.wp.com/latex.php?latex=n%2F%28n%2Ck%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/(n,k)' title='n/(n,k)' class='latex' />.)</p>
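<p>The divisibility statement at the end of that paragraph is easy to test numerically. Here is a minimal Python sketch, assuming Python 3.8 or later for <code>math.comb</code>; the function name <code>divides_binomial</code> and the range of values tested are arbitrary choices made for the example.</p>

```python
from math import comb, gcd

def divides_binomial(n, k):
    """True when n / gcd(n, k) divides the binomial coefficient C(n, k)."""
    return comb(n, k) % (n // gcd(n, k)) == 0

# Try every pair with 1 up to k up to n, for n below 60.
all_ok = all(divides_binomial(n, k)
             for n in range(1, 60)
             for k in range(1, n + 1))
print(all_ok)
```

<p>A check like this proves nothing, but a single failure would have shown that something had gone wrong in the argument.</p>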
<p>I seem to remember that at least one Sylow theorem has something to do with the highest power of a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> that divides the order of the group. So let&#8217;s think briefly about what happens if <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^rm' title='n=p^rm' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is a prime and <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. What should we take as our <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />? If we try <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />, then we get that <img src='http://s0.wp.com/latex.php?latex=%28n%2Ck%29%3Dp%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n,k)=p^r' title='(n,k)=p^r' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=n%2F%28n%2Ck%29%3Dm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/(n,k)=m' title='n/(n,k)=m' class='latex' />. So we can deduce from that that <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+n%7Bp%5Er%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom n{p^r}' title='&#92;binom n{p^r}' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. </p>
<p>I don&#8217;t know whether that will be useful, but let&#8217;s press on and see whether we can show that there must be a subgroup of order <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is a group of order <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^rm' title='n=p^rm' class='latex' />. The plan is to consider the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the collection of all subsets of size <img src='http://s0.wp.com/latex.php?latex=k%3Dp%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=p^r' title='k=p^r' class='latex' />. What we&#8217;d like to show is that at least one orbit has size <img src='http://s0.wp.com/latex.php?latex=n%2Fk%3Dm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/k=m' title='n/k=m' class='latex' />. What do we know about the orbits? Here are some facts that we can write down.</p>
<p>(i) The size of each orbit is a factor of <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^rm' title='n=p^rm' class='latex' />.</p>
<p>(ii) The sum of the sizes of all the orbits is a multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />.</p>
<p>Is there anything else we can say? Well, a rather simple fact is that the union of all the translates of a subset of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is the whole of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. So if that subset has size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />, then the number of distinct translates (which is the same as the size of the orbit of that set) must be at least <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. So let&#8217;s add that to our list.</p>
<p>(iii) Every orbit has size at least <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />.</p>
<p>This is starting to look promising. Every factor of <img src='http://s0.wp.com/latex.php?latex=p%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^rm' title='p^rm' class='latex' /> that exceeds <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> is forced to be a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. So every orbit must have a size that is either a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> or equal to <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. So one possible strategy would be to identify somehow an orbit that has a size that is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, or at least to show that such a thing must exist.</p>
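<p>To see facts (i) to (iii) in action, one can list the orbit sizes in a small example. The sketch below assumes the group of all permutations of three points, which has order 6, so that p is 2, r is 1 and m is 3; the helper names <code>compose</code> and <code>orbit_of</code> are made up for the illustration.</p>

```python
from itertools import combinations, permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: (ab)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)

G = list(permutations(range(3)))   # order n = 6 = 2^1 * 3
p, r, m = 2, 1, 3
n, k = len(G), p ** r

def orbit_of(E):
    """All left translates of E, as a frozenset of frozensets."""
    return frozenset(frozenset(compose(a, x) for x in E) for a in G)

# Subsets in the same orbit give equal values, so the set removes repeats.
orbits = {orbit_of(c) for c in combinations(G, k)}
sizes = sorted(len(orbit) for orbit in orbits)

print(sizes)
print(all(n % s == 0 for s in sizes))              # fact (i)
print(sum(sizes) % m == 0)                         # fact (ii)
print(all(s >= m for s in sizes))                  # fact (iii)
print(all(s == m or s % p == 0 for s in sizes))    # size m or a multiple of p
```

<p>In this example the orbits of size 3 are exactly the orbits of the three subgroups of order 2, which is the phenomenon the strategy is trying to exploit.</p>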
<p>Now I&#8217;m feeling a bit stuck, because I don&#8217;t really see why such an orbit has to exist. Or do I? It suddenly occurs to me that there would be an easy argument if we knew that <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+n%7Bp%5Er%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom n{p^r}' title='&#92;binom n{p^r}' class='latex' /> was not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Is that true? I think it might be, so let&#8217;s do a tiny little experiment. Let&#8217;s take <img src='http://s0.wp.com/latex.php?latex=n%3D12%3D2%5E23&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=12=2^23' title='n=12=2^23' class='latex' /> and let&#8217;s calculate <img src='http://s0.wp.com/latex.php?latex=%5Cbinom%7B12%7D4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom{12}4' title='&#92;binom{12}4' class='latex' />. We get <img src='http://s0.wp.com/latex.php?latex=12%5Ctimes+11%5Ctimes+10%5Ctimes+9%2F4%5Ctimes+3%5Ctimes+2%5Ctimes+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='12&#92;times 11&#92;times 10&#92;times 9/4&#92;times 3&#92;times 2&#92;times 1' title='12&#92;times 11&#92;times 10&#92;times 9/4&#92;times 3&#92;times 2&#92;times 1' class='latex' />. If we cancel out 2s, the numerator becomes <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+11%5Ctimes+5%5Ctimes+9&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 11&#92;times 5&#92;times 9' title='3&#92;times 11&#92;times 5&#92;times 9' class='latex' /> and the denominator becomes 3, so the quotient, which is 495, is odd, just as we wanted.</p>
<p>At the moment, I can&#8217;t think of a nice argument that <img src='http://s0.wp.com/latex.php?latex=%5Cbinom%7Bp%5Erm%7D%7Bp%5Er%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom{p^rm}{p^r}' title='&#92;binom{p^rm}{p^r}' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> when <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> isn&#8217;t. And for the second paragraph in a row, no sooner do I express that kind of pessimism than an idea occurs to me. Let&#8217;s think about the orbits when <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> acts on the sets of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' /> (where <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is still <img src='http://s0.wp.com/latex.php?latex=p%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^rm' title='p^rm' class='latex' />). What I&#8217;m hoping is that plenty of these will have sizes that are multiples of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, and just a few won&#8217;t, so that we&#8217;ll be able to tell that the sum of the sizes of the orbits is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. </p>
<p>Let <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> be a set of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />. The only way the orbit of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> can fail to have size divisible by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is if the stabilizer has size divisible by <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />. Since the stabilizer of a set is never larger than the set itself, that forces the stabilizer to have size exactly <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />. But as we&#8217;ve already seen, the only way that the stabilizer of a set can have size equal to the size of the set itself is if the set is a coset of a subgroup. So the only way the orbit of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> can fail to have size divisible by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is if <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a coset of a subgroup of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />. But the only subgroup of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' /> is the subgroup that consists of all multiples of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. 
That tells us that there is precisely one orbit of size <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />, and we have already observed that all the rest have sizes that are multiples of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> isn&#8217;t a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, we have shown that the sum of the sizes of all the orbits, which is <img src='http://s0.wp.com/latex.php?latex=%5Cbinom%7Bp%5Erm%7D%7Bp%5Er%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom{p^rm}{p^r}' title='&#92;binom{p^rm}{p^r}' class='latex' />, is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, just as we were trying to do.</p>
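<p>To see this partition concretely, one can list the orbits in a small case. The sketch below assumes Python and uses a made-up helper, <em>orbit_sizes</em>; it lets the integers mod n act by translation on their subsets of size k and records the size of each orbit. With n=12 and k=4 (so p=2, r=2, m=3) the argument predicts exactly one orbit of size 3, with every other orbit of even size.</p>

```python
from itertools import combinations

def orbit_sizes(n, k):
    """Sizes of the orbits of Z_n acting by translation on its k-element subsets."""
    seen = set()
    sizes = []
    for subset in combinations(range(n), k):
        E = frozenset(subset)
        if E in seen:
            continue
        # Translate E by every element of Z_n to obtain its whole orbit.
        orbit = {frozenset((x + g) % n for x in E) for g in range(n)}
        seen |= orbit
        sizes.append(len(orbit))
    return sizes

p, r, m = 2, 2, 3                      # n = 12, subsets of size 4
sizes = orbit_sizes(p**r * m, p**r)
print(sorted(sizes))
print(sizes.count(m), sum(sizes))      # number of orbits of size m, total number of subsets
```

<p>The total printed at the end is the number of subsets of size k, which is the binomial coefficient in question.</p>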
<p>The less nice argument I was contemplating was simply to show that the largest power of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> that divides <img src='http://s0.wp.com/latex.php?latex=p%5Erm-t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^rm-t' title='p^rm-t' class='latex' /> is the same as the largest power that divides <img src='http://s0.wp.com/latex.php?latex=p%5Er-t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r-t' title='p^r-t' class='latex' /> when <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> isn&#8217;t a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=0%5Cleq+t%3Cp%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0&#92;leq t&lt;p^r' title='0&#92;leq t&lt;p^r' class='latex' />. I haven&#8217;t checked whether that&#8217;s true, but if it is, then it means that all the <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />s cancel when you work out the binomial coefficient. 
But I don&#8217;t like that proof (if it works) because it doesn&#8217;t tell me <em>why</em> the result is true, whereas the one I&#8217;ve just given does: it gives me a natural partition of the collection of subsets of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' /> into a bunch of sets with sizes divisible by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and one set of size not divisible by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. That&#8217;s a genuine explanation.</p>
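<p>The claim behind the less nice argument can at least be tested numerically. This is a minimal sketch, assuming Python; <em>valuation</em> and <em>factors_match</em> are hypothetical helper names. It compares, factor by factor, the power of p in the numerator of the binomial coefficient with the power of p in the denominator.</p>

```python
def valuation(x, p):
    """Largest e such that p**e divides the positive integer x."""
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e

def factors_match(p, r, m):
    """Check that p^r*m - t and p^r - t contain the same power of p
    for every t in range(p^r)."""
    return all(
        valuation(p**r * m - t, p) == valuation(p**r - t, p)
        for t in range(p**r)
    )

print(factors_match(2, 2, 3))   # the n = 12 example
print(factors_match(3, 2, 4))
print(factors_match(2, 2, 2))   # here m is a multiple of p
```

<p>When m is a multiple of p the comparison already fails at t=0, which is consistent with the hypothesis on m being needed.</p>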
<p>Where have we got to now? We know the following facts. Just in case there is any confusion, I&#8217;m now leaving behind <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> and going back to an arbitrary group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=p%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^rm' title='p^rm' class='latex' />. By &#8220;orbit&#8221; I mean orbit of the left action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the set of all subsets of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />.</p>
<p>Earlier we remarked the following.</p>
<p>(i) The size of each orbit is a factor of <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^rm' title='n=p^rm' class='latex' />.</p>
<p>(ii) The sum of the sizes of all the orbits is a multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />.</p>
<p>(iii) Every orbit has size at least <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />.</p>
<p>To these three facts, we can now add one more.</p>
<p>(iv) The sum of the sizes of all the orbits is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> (because it equals <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+%7Bp%5Erm%7D%7Bp%5Er%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom {p^rm}{p^r}' title='&#92;binom {p^rm}{p^r}' class='latex' />, which we&#8217;ve just shown isn&#8217;t a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />).</p>
<p>Also, we used (i) and (iii) earlier to deduce the following (which is why we bothered to prove (iv)).</p>
<p>(v) Every orbit either has size <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> or has size that is a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</p>
<p>But from (iv) and (v) it follows that at least one orbit must have size <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. If <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a set in that orbit, then the stabilizer of <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> has size <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />, so <em>we&#8217;ve found a subgroup of size</em> <img src='http://s0.wp.com/latex.php?latex=p%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^r' title='p^r' class='latex' />!</p>
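<p>Here is how the conclusion looks in a small non-abelian example. The choice of group (the alternating group on four points, of order 12, so that p=2, r=2 and m=3) and all the function names are assumptions made for illustration. The sketch searches the subsets of size 4 for one whose stabilizer under left multiplication has size 4, and that stabilizer is then a subgroup of the required size.</p>

```python
from itertools import combinations, permutations

def compose(g, h):
    """The permutation 'apply h, then g', with permutations stored as tuples."""
    return tuple(g[h[i]] for i in range(len(h)))

def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0

# The even permutations of four points: a group of order 12 = 2^2 * 3.
G = [g for g in permutations(range(4)) if is_even(g)]
p, r, m = 2, 2, 3

def stabilizer(E):
    """Elements g of G with gE = E under left multiplication."""
    return [g for g in G if frozenset(compose(g, x) for x in E) == E]

# Facts (iv) and (v) guarantee that some subset has an orbit of size m,
# which is the same as having a stabilizer of size p^r.
for subset in combinations(G, p**r):
    E = frozenset(subset)
    H = stabilizer(E)
    if len(H) == p**r:
        break

print(sorted(H))
```

<p>The search is brute force, so it is only sensible for very small groups; the point is just to watch the stabilizer of a set in a small orbit turn out to be a subgroup.</p>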
<p>As usual, if you remove all the how-did-I-think-of-that discussion you end up with a much shorter argument, but it also leaves you with mysterious and unmotivated steps. Suppose you were trying to retain this proof in your head while actively memorizing as little as you possibly could. What might you go for? Here&#8217;s a possibility.</p>
<p>1. Amongst all subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, those that are cosets of subgroups can be defined in terms of sizes of orbits of the left-multiplication action. [Then it's just a question of working out the details, which is not too bad.]</p>
<p>2. If <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^rm' title='n=p^rm' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k%3Dp%5Er&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=p^r' title='k=p^r' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+nk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom nk' title='&#92;binom nk' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. </p>
<p>3. To prove that, think carefully about orbits of the left-multiplication action of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_n' title='&#92;mathbb{Z}_n' class='latex' /> on its subsets of size <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. [You'll then find that you've done most of the work while thinking about 1.]</p>
<p>4. Now write down what you can about the orbits of the same action when the group is a general group of order <img src='http://s0.wp.com/latex.php?latex=p%5Erm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^rm' title='p^rm' class='latex' />. Use the orbit-stabilizer theorem and the observation that every orbit has size at least <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. </p>
<p>5. By this time you&#8217;ll have written down enough information for the result to follow easily.</p>
<p>Even that looks like quite a bit to commit to memory, but with experience one can reduce it further. For example, it should be a reflex action to apply the orbit-stabilizer theorem whenever you are thinking about group actions and the opportunity arises. Also, if you&#8217;re on the ball, then you don&#8217;t have to <em>remember</em> to show that <img src='http://s0.wp.com/latex.php?latex=%5Cbinom+nk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;binom nk' title='&#92;binom nk' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, since the need to do that arises from the proof. Or perhaps I should say that it&#8217;s a standard trick: if you&#8217;ve got a bunch of numbers and you want to show that at least one of them is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, then it&#8217;s enough to show that their sum is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. </p>
<p>This post has ended up being much more specific than its title promised. I haven&#8217;t talked in general about intrinsic group actions, but have instead concentrated on just one proof and discussed how one can come up naturally with an appropriate action (given a small list of possible actions to draw on). Just before I finish, I&#8217;m going to look back at <a href="http://gowers.wordpress.com/2011/11/09/group-actions-ii-the-orbit-stabilizer-theorem/#comment-13464">a comment on my previous post but one on group actions</a>, which I deliberately didn&#8217;t read properly because I knew I was going to write this. It will be interesting to see whether I have come up with precisely the same statement and proof.</p>
<p>It seems that I have, but when looking at the comment I find that I have a bizarre experience that I often have when reading maths: that even if I know an argument well, somebody else&#8217;s presentation of it is often quite hard to understand (even if they&#8217;ve presented it perfectly well, as is the case here). Anyhow, it&#8217;s clearly the same argument, and it demonstrates very clearly what I said about the actual argument being much much shorter than the thought processes that give rise to it, though the comparison is slightly unfair because some of the details were left out in that comment.</p>
<p>I&#8217;m going to stop here, but if I were more conscientious I would continue with an application of an action based on conjugation. Instead, I refer you to the same comment, which briefly mentions an application of this kind. I also refer you to <a href="http://en.wikipedia.org/wiki/Sylow_theorems">the Wikipedia article on the Sylow theorems</a> if you&#8217;re interested. (This post has proved the first of the three Sylow theorems.)</p>
<p>I now see that what I&#8217;ve just talked about in this post was covered in question 13 of Examples Sheet 3. So let me stress that what I&#8217;m interested in here is different. That question tells you exactly what to do (consider such-and-such an action, apply the orbit-stabilizer theorem, etc.), whereas here I&#8217;ve attempted to start from scratch without any of those hints. I&#8217;m hoping that that will give a better idea of how to think of intrinsic group actions than you get if you&#8217;re fed them out of a spoon.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/12/10/group-actions-iv-intrinsic-actions/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>A short post on countability and uncountability</title>
		<link>http://gowers.wordpress.com/2011/11/28/a-short-post-on-countability-and-uncountability/</link>
		<comments>http://gowers.wordpress.com/2011/11/28/a-short-post-on-countability-and-uncountability/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 15:44:02 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Numbers and Sets]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3833</guid>
		<description><![CDATA[There is plenty I could write about countability and uncountability, but much of what I have to say I have said already in written form, and I don&#8217;t see much reason to rewrite it. So here&#8217;s a link to two articles on the Tricki, which, if you don&#8217;t know, is a wiki for mathematical techniques. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3833&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>There is plenty I could write about countability and uncountability, but much of what I have to say I have said already in written form, and I don&#8217;t see much reason to rewrite it. So here&#8217;s a link to two articles on <a href="http://www.tricki.org/">the Tricki</a>, which, if you don&#8217;t know, is a wiki for mathematical techniques. The Tricki hasn&#8217;t taken off, and probably never will, but it&#8217;s still got some useful material on it that you might enjoy looking at. The articles in question are one about <a href="http://www.tricki.org/article/A_quick_way_of_recognising_countable_sets">how to tell almost instantly whether a set is countable</a> and another about <a href="http://www.tricki.org/article/A_good_way_of_proving_that_a_set_is_countable">how to find neat proofs that sets are countable when they are</a>.<br />
<span id="more-3833"></span></p>
<p>The main additional point I&#8217;d like to make about this whole area is that you will do much better if you follow some of the general advice from earlier in this series of posts and <em>work from the formal definitions and basic facts that you have been taught</em>. Perhaps I can make that clearer by spelling out what you shouldn&#8217;t do, which is to pay too much attention to the words &#8220;countable&#8221; and &#8220;uncountable&#8221;. Let&#8217;s face it, you can&#8217;t count the natural numbers &#8212; you&#8217;ll be dead long before you&#8217;ve got to <img src='http://s0.wp.com/latex.php?latex=10%5E%7B20%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='10^{20}' title='10^{20}' class='latex' />. You can&#8217;t even put them in a list, since the number of atoms in the universe is only around <img src='http://s0.wp.com/latex.php?latex=10%5E%7B79%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='10^{79}' title='10^{79}' class='latex' />. And if you imagine some hypothetical world where you live for ever, you&#8217;ll never actually finish counting through the natural numbers if you try to do so (unless you find some way of speeding up without limit so that the sum of the times you take converges, but let&#8217;s not go there). So if you think of countable as meaning &#8220;can be counted&#8221;, then you risk confusing yourself &#8212; and I know for a fact that many people do end up confusing themselves.</p>
<p>Far better to stick to basic facts and definitions that are stated in precise mathematical language. Here&#8217;s a list of them &#8212; I may forget one or two important ones but I&#8217;ll try not to. You should have all these facts at your fingertips. (If you can prove the ones that aren&#8217;t definitions, then so much the better, but knowing the facts is even more important than knowing the proofs, since it is the facts themselves that you will use to go on to prove other things.)</p>
<p>Incidentally, some people use the convention that all finite sets are countable, whereas others use the word &#8220;countable&#8221; only for infinite sets. I&#8217;ll use the convention that finite sets are countable, so if you prefer the other convention then you&#8217;ll have to make some small modifications.</p>
<p>1. A set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is <em>finite</em> if for some <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> there is a bijection <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AX%5Cto%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:X&#92;to&#92;{1,2,&#92;dots,n&#92;}' title='&#92;phi:X&#92;to&#92;{1,2,&#92;dots,n&#92;}' class='latex' />. Otherwise, it is <em>infinite</em>.</p>
<p>2. A set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is <em>countable</em> if and only if it is finite or there is a bijection <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AX%5Cto%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:X&#92;to&#92;mathbb{N}' title='&#92;phi:X&#92;to&#92;mathbb{N}' class='latex' />. Otherwise, it is <em>uncountable</em>.</p>
<p>3. Two sets <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> are said to have <em>the same cardinality</em> if there is a bijection <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:X&#92;to Y' title='&#92;phi:X&#92;to Y' class='latex' />.</p>
<p>4. If <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> are sets, then <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> <em>has cardinality at most that of</em> <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> if there is an injection <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi:X&#92;to Y' title='&#92;psi:X&#92;to Y' class='latex' />. </p>
<p>5. If <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> are sets and <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is non-empty, then the following two statements are equivalent.</p>
<p>(i) There is an injection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' />.</p>
<p>(ii) There is a surjection from <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.</p>
<p>6. Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be an infinite set. The following statements are equivalent.</p>
<p>(i) There is a bijection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />.</p>
<p>(ii) There is an injection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />.</p>
<p>(iii) There is a surjection from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.</p>
<p>[Note that this gives three potential ways of proving that <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is countable. Although I gave (i) as the definition of countability, it is usually much more convenient to prove (ii).]</p>
<p>7. <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> is uncountable.</p>
<p>8. If <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is any set, then the power set of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> has strictly larger cardinality than <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. (Equivalently, there is no surjection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to the power set of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.) In particular, the power set of an infinite set is uncountable.</p>
<p>9. A union of countably many countable sets is countable. More formally, if <img src='http://s0.wp.com/latex.php?latex=%5CGamma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Gamma' title='&#92;Gamma' class='latex' /> is a countable set and for each <img src='http://s0.wp.com/latex.php?latex=%5Cgamma%5Cin%5CGamma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;gamma&#92;in&#92;Gamma' title='&#92;gamma&#92;in&#92;Gamma' class='latex' /> the set <img src='http://s0.wp.com/latex.php?latex=X_%5Cgamma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X_&#92;gamma' title='X_&#92;gamma' class='latex' /> is countable, then the union <img src='http://s0.wp.com/latex.php?latex=%5Cbigcup_%7B%5Cgamma%5Cin%5CGamma%7DX_%5Cgamma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bigcup_{&#92;gamma&#92;in&#92;Gamma}X_&#92;gamma' title='&#92;bigcup_{&#92;gamma&#92;in&#92;Gamma}X_&#92;gamma' class='latex' /> is countable.</p>
<p>10. In particular, a union of countably many finite sets is countable. If you are told to prove that a set is countable, then using this very simple principle usually leads to the shortest proof.</p>
<p>11. If <img src='http://s0.wp.com/latex.php?latex=n%3Em&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&gt;m' title='n&gt;m' class='latex' /> then there is no injection from the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cm%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,m&#92;}' title='&#92;{1,2,&#92;dots,m&#92;}' class='latex' /> (and hence no surjection from the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cm%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,m&#92;}' title='&#92;{1,2,&#92;dots,m&#92;}' class='latex' /> to the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' />).</p>
<p>[This may look obvious, but it needs a proof. One way to do it is to use the well-ordering principle. Pick a counterexample with <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> minimal. Let <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> be an injection from <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cm%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,m&#92;}' title='&#92;{1,2,&#92;dots,m&#92;}' class='latex' />. If <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28n%29%3Dj&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(n)=j' title='&#92;phi(n)=j' class='latex' />, then define <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%3A%5C%7B1%2C%5Cdots%2Cn-1%5C%7D%5Cto%5C%7B1%2C%5Cdots%2Cm-1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi:&#92;{1,&#92;dots,n-1&#92;}&#92;to&#92;{1,&#92;dots,m-1&#92;}' title='&#92;psi:&#92;{1,&#92;dots,n-1&#92;}&#92;to&#92;{1,&#92;dots,m-1&#92;}' class='latex' /> by taking <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%28r%29%3D%5Cphi%28r%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi(r)=&#92;phi(r)' title='&#92;psi(r)=&#92;phi(r)' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28r%29%3Cj&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(r)&lt;j' title='&#92;phi(r)&lt;j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%28r%29%3D%5Cphi%28r%29-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi(r)=&#92;phi(r)-1' title='&#92;psi(r)=&#92;phi(r)-1' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28r%29%3Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' 
alt='&#92;phi(r)&gt;j' title='&#92;phi(r)&gt;j' class='latex' />. This is an injection, which contradicts the minimality of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.]</p>
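<p>The construction in the proof of fact 11 can be written out explicitly. This is a sketch, assuming Python, with injections stored as dictionaries; <em>reduce_injection</em> and <em>is_injection</em> are hypothetical names. Since there are no injections with n larger than m to feed in, the sketch exercises the construction on injections from a set of size 3 into a set of size 4 and checks that the result is again an injection into a set that is one element smaller.</p>

```python
from itertools import permutations

def reduce_injection(phi, n):
    """Given an injection phi from {1..n} into {1..m}, build psi on {1..n-1}
    as in the proof: forget n and close up the gap left at j = phi(n)."""
    j = phi[n]
    return {r: (phi[r] if j > phi[r] else phi[r] - 1) for r in range(1, n)}

def is_injection(f):
    """True if no two keys of the dictionary f share a value."""
    return len(set(f.values())) == len(f)

n, m = 3, 4
for images in permutations(range(1, m + 1), n):
    phi = dict(zip(range(1, n + 1), images))
    psi = reduce_injection(phi, n)
    assert is_injection(psi)
    assert set(psi.values()).issubset(range(1, m))   # values lie in {1..m-1}
```

<p>Because phi is an injection, no r below n has phi(r) equal to j, so the two cases in the definition of psi cover everything.</p>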
<p>Here are a couple of examples of how to do exercises that involve countability. </p>
<p><strong>1.</strong> <em>Prove that if <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is countable and <img src='http://s0.wp.com/latex.php?latex=f%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:X&#92;to Y' title='f:X&#92;to Y' class='latex' /> is an injection, then <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is countable.</em></p>
<p><strong>Solution.</strong> Since <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is countable, there is an injection <img src='http://s0.wp.com/latex.php?latex=g%3AY%5Cto%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g:Y&#92;to&#92;mathbb{N}' title='g:Y&#92;to&#92;mathbb{N}' class='latex' />. A composition of injections is an injection, so <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is an injection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is countable. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>Note how short and clean the above proof is. Note also that what I did <em>not</em> do was say anything about &#8220;putting the elements of <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> in a list&#8221;.</p>
<p><strong>2.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be an uncountable set and let <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> be an injection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to another set <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' />. Prove that <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is uncountable.</em></p>
<p><strong>Solution.</strong> Since uncountability is defined negatively, it will be no surprise that we prove this result by looking at the contrapositive. If <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is countable, then by the previous exercise <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is countable, contradicting our hypothesis. So <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is uncountable. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p><strong>3.</strong> <em>Prove that the set of all irrational numbers is uncountable.</em></p>
<p><strong>Solution 1.</strong> There are various ways of doing this. The easiest argument starts from the thought that the reals form a huge set, and to get the irrationals we take away just the rationals, which form a small set. Therefore, the irrationals must form a huge set. </p>
<p>To turn that into a proper proof, we once again prove the contrapositive &#8212; for the same reason as we did when solving question 2. If the set of all irrationals is countable, then the reals are the union of two countable sets. Hence, by fact 9 above, the reals are countable. But that contradicts fact 7. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /> </p>
<p><strong>Solution 2.</strong> It is tempting to try to use the solution to 2 above. That is, we&#8217;d like to find a set that we know is uncountable, and define an injection from that set to the irrationals. The most obvious uncountable set is the set of reals. Can we inject those to the set of irrationals? Hmm, it seems hard, since nothing that&#8217;s even slightly continuous has any hope of working. Are there any &#8220;less continuous&#8221; uncountable sets that we could use? Yes: we could take the set of all 01 sequences. So now we&#8217;d like a way of associating an irrational with each 01 sequence. This is fun to do, so here&#8217;s a spoiler alert. I&#8217;ll leave some space, then I&#8217;ll give a solution in just one paragraph, then I&#8217;ll leave some more space, and then I&#8217;ll present a third solution.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
Bearing in mind that all rational numbers have repeating decimal expansions, we just need some way of associating a non-repeating sequence with each 01 sequence. This can be done in many natural ways, of which one is this. To each sequence associate a decimal between 0 and 1 that is 0 in the <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />th decimal place whenever <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is not a square, and is either 1 or 2 in the square places according to whether the 01 sequence is 0 or 1. For example, if the 01 sequence begins 00101, then the decimal will begin 0.10010000200000010000000020&#8230; <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /><br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
.<br />
<strong>Solution 3.</strong> This is just for people who know a little about continued fractions. If you don&#8217;t, then don&#8217;t worry about it, though if you read the beginning of <a href="http://en.wikipedia.org/wiki/Continued_fraction">the Wikipedia article on the subject</a> then that will be more than enough to understand what follows.</p>
<p>Continued fractions give a very beautiful bijection between the set of all positive irrational numbers and the set of all infinite sequences of natural numbers. Given a positive irrational number, you just take the terms of its continued-fraction expansion, and given an infinite sequence, you just take the number that has those terms, which must be irrational since all rational numbers have terminating continued-fraction expansions. It is easy to prove that the set of all infinite sequences of natural numbers is uncountable, so we&#8217;re done. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
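<p>The digit construction in Solution 2 can be sketched in a few lines of Python. This is only an illustration: the function name <code>decimal_digits</code> and its arguments are made up for the purpose, and it can only ever print finitely many digits of a number whose expansion is infinite.</p>

```python
from math import isqrt


def decimal_digits(bits, places):
    """First `places` decimal digits of the number attached to a 01 sequence.

    The digit in place k*k is 1 if bits[k-1] is 0 and 2 if bits[k-1] is 1.
    Every non-square place gets the digit 0.
    """
    digits = []
    for n in range(1, places + 1):
        k = isqrt(n)
        if k * k == n:
            digits.append("2" if bits[k - 1] == 1 else "1")
        else:
            digits.append("0")
    return "0." + "".join(digits)


# The example from the text: the sequence begins 00101.
print(decimal_digits([0, 0, 1, 0, 1], 26))  # 0.10010000200000010000000020
```

<p>Two sequences that first differ in position k give decimals that differ in decimal place k squared, so the map is an injection, and since the gaps between consecutive squares grow without bound the expansion never settles into a repeating pattern.</p>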
<hr />
<p>Sometimes one is asked to prove that a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is uncountable when you&#8217;re not told what <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is, but just that it has certain properties. This is slightly harder to deal with. I&#8217;m not going to work through an example, because I don&#8217;t want to spoil what may be a nice examples sheet question from next term. However, here is a technique that can sometimes work very nicely. You define, using information about <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, a function <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> that takes finite sequences of 0s and 1s to points in <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, and you do it in such a way that for any infinite sequence, the images of its initial segments form a sequence that you prove converges to something in <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. (For instance, if the infinite sequence starts 110101&#8230; then the sequence of points in <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> starts <img src='http://s0.wp.com/latex.php?latex=%5Cphi%281%29%2C%5Cphi%2811%29%2C%5Cphi%28110%29%2C%5Cphi%281101%29%2C%5Cphi%2811010%29%2C%5Cphi%28110101%29%2C...&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(1),&#92;phi(11),&#92;phi(110),&#92;phi(1101),&#92;phi(11010),&#92;phi(110101),...' title='&#92;phi(1),&#92;phi(11),&#92;phi(110),&#92;phi(1101),&#92;phi(11010),&#92;phi(110101),...' class='latex' />.) 
You also do the construction in a way that ensures that no two limits are the same.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/28/a-short-post-on-countability-and-uncountability/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Group actions III &#8212; what&#8217;s the point of them?</title>
		<link>http://gowers.wordpress.com/2011/11/25/group-actions-iii-whats-the-point-of-them/</link>
		<comments>http://gowers.wordpress.com/2011/11/25/group-actions-iii-whats-the-point-of-them/#comments</comments>
		<pubDate>Fri, 25 Nov 2011 11:07:56 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3811</guid>
		<description><![CDATA[Somebody told me recently that a few years ago they had a supervision with a colleague of mine (who shall remain nameless, but he or she is an applied mathematician) and asked what the point of group actions was. &#8220;I have absolutely no idea,&#8221; was the response, and the implication that one might draw from [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3811&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Somebody told me recently that a few years ago they had a supervision with a colleague of mine (who shall remain nameless, but he or she is an applied mathematician) and asked what the point of group actions was. &#8220;I have absolutely no idea,&#8221; was the response, and the implication that one might draw from it was apparently intended.</p>
<p>No pure mathematician could hold such a view. I&#8217;ve stated a few times that group actions tell you a lot about groups. In this post I want to try to explain why that is, though there is far more to say than I am capable of explaining, let alone fitting into one blog post.</p>
<p>Several proofs that use group actions seem to depend on almost magically coming up with an action that just happens, when you analyse it the right way, to tell you what you wanted to know. I am not an algebraist and do not have a good all-purpose method for finding actions to prove given statements. I don&#8217;t rule out that such a method might exist, at least for reasonably simple statements, and would be interested to hear from anybody who thinks they can usefully add to what I have to say.<br />
<span id="more-3811"></span></p>
<p>In the absence of a systematic method for coming up with group actions for specific purposes, the next best thing is to have a good supply of actions that one can simply look through in the hope of finding something useful. Very often, features of the problem will enable you to rule out many actions as unhelpful, which reduces the brute-force search to something manageable. Although this isn&#8217;t as systematic as the general method I described when discussing how one might come up with the notion of a quotient group (which was, roughly speaking, to pretend you&#8217;ve found the object you&#8217;re looking for and say so much about it that you end up determining it uniquely), it comes fairly close.</p>
<p><strong>A supply of group actions.</strong></p>
<p>I&#8217;ll divide these into <em>intrinsic</em> and <em>extrinsic</em> actions. By an intrinsic group action I mean one that is defined in a general abstract way in terms of the group itself, and by an extrinsic group action I mean one that is defined in terms of other objects that come into a particular description of the group.</p>
<p>Rather than attempt to make these definitions more precise, let me illustrate them with all the examples I can think of. There are bound to be many more, so again I&#8217;d be grateful if people want to suggest further examples.</p>
<p><strong>Extrinsic group actions.</strong></p>
<p>I&#8217;ll start with some extrinsic examples. A common way for these to arise is if the group is given to you as a group of symmetries, or more generally a group of transformations of some set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. Of course, that already gives us an action before we even start: the group acts on the set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. However, it potentially gives us many more actions.</p>
<p>As an example, let&#8217;s consider the symmetry group of a cube. As defined, this group acts on the entire cube, but we soon spot that certain subsets of the cube are <em>invariant</em> under the action &#8212; this means that if you take a point in the subset and apply one of the symmetries you will get another point in the set. For instance, the set of centres of faces is invariant, since every symmetry takes the centre of a face to the centre of some other (or possibly the same) face. So, writing <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> for the symmetry group of the cube, we find that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on the set of centres of faces.</p>
<p>Does this tell us anything interesting about <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />? Well, it tells us that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is isomorphic to a subgroup of <img src='http://s0.wp.com/latex.php?latex=S_6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_6' title='S_6' class='latex' />, but that&#8217;s only mildly interesting. One of the reasons for the limited interest is that this action is faithful (that is, different symmetries permute the centres of the faces in different ways), so regarding <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> as acting on the centres of the faces rather than on the cube feels a bit too close to the original description of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. </p>
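<p>If you want to check the faithfulness claim by machine, here is a rough Python sketch. It rests on an assumption that is not in the discussion above: the cube is taken to be the one with vertices at the points with all coordinates plus or minus 1, so that its symmetries are the 48 signed permutation matrices. The names <code>G</code>, <code>act</code>, <code>FACE_CENTRES</code> and <code>face_permutation</code> are made up for the illustration.</p>

```python
from itertools import permutations, product

# Assumption of this sketch: the cube is [-1, 1]^3, so its symmetries are the
# 48 signed permutation matrices (one nonzero entry, +1 or -1, per row).
G = [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
     for p in permutations(range(3)) for s in product((1, -1), repeat=3)]


def act(m, v):
    """Apply the 3x3 matrix m to the vector v."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


# The six face centres are plus or minus the standard basis vectors.
FACE_CENTRES = [tuple(sign if i == axis else 0 for i in range(3))
                for axis in range(3) for sign in (1, -1)]


def face_permutation(m):
    """Where each face centre goes, recorded as a tuple of indices."""
    return tuple(FACE_CENTRES.index(act(m, c)) for c in FACE_CENTRES)


distinct = {face_permutation(m) for m in G}
print(len(G), len(distinct))  # 48 48, so the action is faithful
```

<p>The 48 symmetries give 48 different permutations of the six face centres, which is what faithfulness means here.</p>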
<p>However, things get a bit more interesting if we spot that opposite points go to opposite points. This means that we can partition the six centres of faces into three sets, each consisting of two opposite points, and think of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> as acting on the set that consists of these three <em>pairs</em> of points. Here are two other ways of saying what I&#8217;ve just said. We could regard two face centres as equivalent if they are opposite each other and say that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on the equivalence classes. Or if we wanted to be more geometrical, we could join the opposite centres by line segments and say that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on the three line segments.</p>
<p>This time the action isn&#8217;t faithful, since a half turn takes each line segment (to use the geometrical viewpoint) to itself, as does a reflection in the plane through the centre that&#8217;s parallel to one (and hence two) of the faces. Earlier, I implied that that would be more interesting, so let&#8217;s try to think what it could tell us.</p>
<p>Another point that I&#8217;ve been trying to put across in these posts is that</p>
<ul><li><em>kernels of homomorphisms are normal subgroups are kernels of homomorphisms</em></li></ul>
<p>and yet another is that</p>
<ul><li><em>a group action is a homomorphism from a group to a group of transformations of a set.</em></li></ul>
<p>Now a group action is faithful if and only if the homomorphism to the group of transformations is an injection, which (as is easy to show and I&#8217;m sure you&#8217;ve seen) is the case if and only if the kernel of the homomorphism consists of just the identity element. So if we put the two points together, then we get a third point, and it&#8217;s a useful one:</p>
<ul><li><em>a non-trivial action that is not faithful gives rise to a non-trivial normal subgroup.</em></li></ul>
<p>Why is this? Well, the action is a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. The kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a normal subgroup. If the action isn&#8217;t faithful, then its kernel isn&#8217;t just the identity, and if the action doesn&#8217;t just send every element of the group to the identity (that&#8217;s what I mean by the action being non-trivial) then its kernel isn&#8217;t the whole group. </p>
<p>Now let&#8217;s go back to the example we had of an unfaithful action. It was the action of the symmetry group of the cube on the set of three line segments that join opposite centres of faces. This has a non-trivial kernel, which is generated by the three reflections that keep two of the lines fixed and flip the other one round. This group is <img src='http://s0.wp.com/latex.php?latex=C_2%5Ctimes+C_2%5Ctimes+C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2&#92;times C_2&#92;times C_2' title='C_2&#92;times C_2&#92;times C_2' class='latex' />, so we find that the symmetry group of the cube has <img src='http://s0.wp.com/latex.php?latex=C_2%5Ctimes+C_2%5Ctimes+C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2&#92;times C_2&#92;times C_2' title='C_2&#92;times C_2&#92;times C_2' class='latex' /> as a normal subgroup. Also, the isomorphism theorem tells us that the quotient group is isomorphic to the image of the homomorphism of which we have just calculated the kernel. Since it is possible to permute the three segments however you like, this image is <img src='http://s0.wp.com/latex.php?latex=S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_3' title='S_3' class='latex' />, so the quotient is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_3' title='S_3' class='latex' />. (Incidentally, it may be worth mentioning a mistake that you could conceivably make here, which is to &#8220;multiply both sides&#8221; of <img src='http://s0.wp.com/latex.php?latex=G%2FH%3DK&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G/H=K' title='G/H=K' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. 
In our case, it is not true that the symmetry group of the cube is isomorphic to <img src='http://s0.wp.com/latex.php?latex=C_2%5Ctimes+C_2%5Ctimes+C_2%5Ctimes+S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2&#92;times C_2&#92;times C_2&#92;times S_3' title='C_2&#92;times C_2&#92;times C_2&#92;times S_3' class='latex' />. Something of the kind is true, but the &#8220;product&#8221; is more complicated than the straightforward group product.)</p>
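<p>The kernel and image of this action can be checked by brute force. The Python sketch below is only an illustration and makes an assumption that is not in the discussion above: the cube is the one with vertices at the points with all coordinates plus or minus 1, so that its symmetries are the 48 signed permutation matrices and the three segments lie along the coordinate axes. All the names in the code are made up.</p>

```python
from itertools import permutations, product

# Assumption of this sketch: the cube is [-1, 1]^3, so its symmetries are the
# 48 signed permutation matrices (one nonzero entry, +1 or -1, per row).
G = [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
     for p in permutations(range(3)) for s in product((1, -1), repeat=3)]
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def mul(a, b):
    """Matrix product of two 3x3 matrices stored as tuples of rows."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))


def transpose(a):
    """Transpose, which is the inverse for these orthogonal matrices."""
    return tuple(zip(*a))


def axis_permutation(m):
    """Column j of m is the image of the j-th basis vector; report its axis."""
    return tuple(next(i for i in range(3) if m[i][j] != 0) for j in range(3))


kernel = [m for m in G if axis_permutation(m) == (0, 1, 2)]
image = {axis_permutation(m) for m in G}
all_square_to_identity = all(mul(k, k) == IDENTITY for k in kernel)
closed_under_conjugation = all(mul(mul(g, k), transpose(g)) in kernel
                               for g in G for k in kernel)
print(len(kernel), len(image), all_square_to_identity, closed_under_conjugation)
```

<p>The kernel has eight elements, each of which squares to the identity, the image has six elements, and the kernel is closed under conjugation, all of which agrees with the description above.</p>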
<p>We&#8217;ve therefore used the action of the symmetry group on the three line segments to obtain information about the group itself. It&#8217;s not that hard to show that the subgroup generated by those three reflections is closed under conjugation, but if you want to know that it&#8217;s a normal subgroup, then it&#8217;s much easier just to observe that the group acts on those three line segments and that the subgroup is precisely the kernel of that action (when the action is considered as a homomorphism). </p>
<p>What other actions of this group might be interesting? Well, we quite like actions on small sets, since those are more likely not to be faithful. So one thing we could try to do is look for small orbits (of the action on the entire cube). The reason I said that is that if we put a point <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> into the set on which we&#8217;d like <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to act, then obviously we have to put all the other points in the orbit of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. And if you think about it, that&#8217;s exactly what we did with the centres of the faces: once one of them was in there, we had to put in the others too.</p>
<p>That suggests that we can&#8217;t do much better, if we&#8217;re going for small sets, than what we&#8217;ve done already. (Well, we could make <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> act on the centre of the cube, but that&#8217;s a trivial action, so uninteresting.) But let&#8217;s remember the second idea we had, which was to observe that certain relationships between points are preserved by the transformations in <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. In the case of the previous action, we noted that opposite points go to opposite points, which led us to split them up into three pairs. </p>
<p>We can do exactly the same with a different set of segments, this time the four segments that join a vertex of the cube to the opposite vertex. It&#8217;s clear that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on these four main diagonals, so what does that tell us about <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />?</p>
<p>Let&#8217;s think first of all about whether the action is faithful. Is there any symmetry of the cube that is not the identity but that nevertheless preserves all four diagonals? Yes there is: the function that takes each point to the opposite point, which is sometimes called reflection through the centre. (If your cube is sitting in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^3' title='&#92;mathbb{R}^3' class='latex' /> with its centre at the origin, then the symmetry is the map that takes each vector <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=-x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-x' title='-x' class='latex' />.) A bit of further thought reveals that this is the only map, apart from the identity, that preserves the four diagonals. To see why, note that each vertex must go either to itself or to the opposite vertex. Since neighbouring vertices must also go to neighbouring vertices, once you&#8217;ve decided where one vertex goes, you&#8217;ve decided where all its neighbours go, and all their neighbours, and all their neighbours &#8212; which has ended up including all vertices.</p>
<p>So the kernel of the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the four diagonals is the cyclic group <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' />. What about its image? That is, which permutations of the four diagonals can arise if you perform a symmetry of the cube?</p>
<p>A preliminary question we might ask ourselves is whether there are any restrictions at all. The way I like to approach a question like this is to ask myself whether there are any relationships that sometimes hold and sometimes don&#8217;t. That&#8217;s a bit of a vague statement (deliberately, because the idea works quite generally), but we get a good example of it if we go back to the set of eight vertices. Why can&#8217;t we permute them arbitrarily? Because some pairs of vertices are next to each other and others aren&#8217;t. Some pairs of vertices are opposite each other and others aren&#8217;t. These relationships are preserved by the symmetries of a cube, so we can&#8217;t, for instance, take two neighbouring vertices to two vertices that are not next to each other. </p>
<p>This doesn&#8217;t seem to work with diagonals, since for every pair of diagonals you can find two opposite edges such that the two diagonals join their two end points (making, with the edges themselves, a kind of bow-tie shape). That doesn&#8217;t prove that we can get all permutations of the diagonals, but it starts to suggest it, so let&#8217;s look in the other direction. </p>
<p>If you want to show that you can get an arbitrary permutation, then a good way of doing it is to show that you can get all <em>transpositions</em>, since these generate the symmetric group. So we find ourselves asking this: given two main diagonals of the cube, is there a symmetry that exchanges those diagonals and sends the other two diagonals to themselves? You might like to draw a picture here, but let me try to explain in words why the answer is yes. You take the plane that contains the other two diagonals and reflect in it. That obviously keeps the other two diagonals where they were, and it must interchange the two diagonals you want to exchange, since otherwise the map would have to be the identity by the argument we used earlier to identify the kernel of the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the diagonals. (Alternatively, just check that this reflection really does swap the diagonals it&#8217;s supposed to swap.) </p>
<p>A reflection isn&#8217;t the only way of swapping just two diagonals. Indeed, it can&#8217;t be because the kernel of the action is non-trivial. Since we know what the kernel is, we know that minus the reflection we&#8217;ve just defined must also swap the two diagonals. A more geometrical way of thinking of it is as a half turn about an axis perpendicular to the plane that contains the two diagonals you want to send to themselves.</p>
<p>We have now shown that the image of the action is the permutation group of the four diagonals, so it is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' />. By the isomorphism theorem, we find that <img src='http://s0.wp.com/latex.php?latex=G%2FC_2%3DS_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G/C_2=S_4' title='G/C_2=S_4' class='latex' />. </p>
<p>The fact that this particular copy of <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' /> (consisting of the identity and the map that sends every point to the opposite point) is a normal subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is easy to see directly, since it is given by the matrix <img src='http://s0.wp.com/latex.php?latex=-I&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-I' title='-I' class='latex' /> (where <img src='http://s0.wp.com/latex.php?latex=I&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='I' title='I' class='latex' /> is the <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 3' title='3&#92;times 3' class='latex' /> identity matrix), which commutes with all other <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 3' title='3&#92;times 3' class='latex' /> matrices. Because of that, it follows that in this case we can &#8220;multiply through&#8221; and see that <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=C_2%5Ctimes+S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2&#92;times S_4' title='C_2&#92;times S_4' class='latex' />. [Edit: that's not quite enough of a reason -- see <a href="http://gowers.wordpress.com/2011/11/25/group-actions-iii-whats-the-point-of-them/#comment-13918">Greg Martin's comment below</a> and my response to it.] 
Moreover, every symmetry of the cube is plus or minus a rotation, so the rotation group of the cube is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' />, a fact we can see another way by noting that the rotation group also acts on the four main diagonals, this time faithfully, and the image of the action is <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' />.</p>
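<p>Here is a rough Python check of the claims about the four main diagonals. As an assumption that is not in the discussion above, the cube is taken to be the one with vertices at the points with all coordinates plus or minus 1, so that its symmetries are the 48 signed permutation matrices. Each diagonal is recorded by its endpoint with first coordinate +1, and every name in the code is made up for the illustration.</p>

```python
from itertools import permutations, product

# Assumption of this sketch: the cube is [-1, 1]^3, so its symmetries are the
# 48 signed permutation matrices (one nonzero entry, +1 or -1, per row).
G = [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
     for p in permutations(range(3)) for s in product((1, -1), repeat=3)]

# One endpoint per main diagonal: the one whose first coordinate is +1.
DIAGONALS = [(1, a, b) for a in (1, -1) for b in (1, -1)]


def act(m, v):
    """Apply the 3x3 matrix m to the vector v."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def negate(m):
    return tuple(tuple(-x for x in row) for row in m)


def det(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def diagonal_permutation(m):
    """Where each diagonal goes, recorded as a tuple of indices."""
    out = []
    for d in DIAGONALS:
        w = act(m, d)
        if w[0] == -1:  # switch to the endpoint with first coordinate +1
            w = tuple(-c for c in w)
        out.append(DIAGONALS.index(w))
    return tuple(out)


kernel = [m for m in G if diagonal_permutation(m) == (0, 1, 2, 3)]
image = {diagonal_permutation(m) for m in G}
rotations = [m for m in G if det(m) == 1]
rotation_image = {diagonal_permutation(m) for m in rotations}
plus_or_minus = all(m in rotations or negate(m) in rotations for m in G)
# Symmetries that swap the first two diagonals and fix the other two.
swaps = [m for m in G if diagonal_permutation(m) == (1, 0, 2, 3)]

print(len(kernel), len(image), len(rotations), len(rotation_image), plus_or_minus)
print(sorted(det(m) for m in swaps))
```

<p>The first line printed is 2 24 24 24 True: a kernel of order two, all 24 permutations in the image, and 24 rotations acting faithfully. The second is [-1, 1], which says that a transposition of two diagonals comes from exactly one reflection and exactly one rotation.</p>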
<p>I invite you to think how you might see that the group of rotations of the cube is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' /> without using the concept of group actions. If you see an easy argument, then let me know. But it&#8217;s sort of obvious that if you want to see it, then the best way is to find a set of four objects that the rotations permute.</p>
<p>Have we exhausted the interesting actions that we can build out of the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the cube? I&#8217;m afraid not. Recall that our techniques so far are these. </p>
<p>(i) Consider the action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the orbit of a point (ideally chosen in a nice &#8220;symmetrical place&#8221; so that its orbit isn&#8217;t too big).</p>
<p>(ii) Find an equivalence relation on an orbit with the property that the transformations in <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> take equivalent points to equivalent points, and look at the resulting action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the equivalence classes.</p>
<p>Let&#8217;s continue to think about what happens when <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is the symmetry group of a cube and the point is a vertex. The orbit of that point is the set of all eight vertices of the cube. What equivalence relations are there on this set with the property that equivalent vertices go to equivalent vertices? There are two trivial ones (one where there is just one equivalence class and one where the equivalence classes are singletons), and one that we have already considered (where two vertices are considered equivalent when they are opposite each other). Are there any more? </p>
<p>Let&#8217;s answer this question using the pretend-you&#8217;ve-got-it-already technique. So let&#8217;s suppose that <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is an equivalence relation on the set of vertices and that if <img src='http://s0.wp.com/latex.php?latex=v%5Csim+w&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='v&#92;sim w' title='v&#92;sim w' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' /> is a symmetry of the cube then <img src='http://s0.wp.com/latex.php?latex=Tv%5Csim+Tw&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Tv&#92;sim Tw' title='Tv&#92;sim Tw' class='latex' />. Given a pair of vertices <img src='http://s0.wp.com/latex.php?latex=%28v%2Cw%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(v,w)' title='(v,w)' class='latex' />, what other pairs of vertices can arise as <img src='http://s0.wp.com/latex.php?latex=%28Tv%2CTw%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(Tv,Tw)' title='(Tv,Tw)' class='latex' />? 
The answer is that the distance between vertices never changes as a result of a symmetry, and that if the distance between <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> is the same as the distance between <img src='http://s0.wp.com/latex.php?latex=v&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='v' title='v' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=w&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='w' title='w' class='latex' />, then there is a symmetry <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Tv%3Dt&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Tv=t' title='Tv=t' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Tw%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Tw=u' title='Tw=u' class='latex' />. </p>
<p>From this it follows that <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> can depend only on the distance between <img src='http://s0.wp.com/latex.php?latex=v&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='v' title='v' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=w&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='w' title='w' class='latex' />. Incidentally, there are at least two sensible notions of distance here. One is just the physical distance in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^3' title='&#92;mathbb{R}^3' class='latex' />, but an easier one to take is the number of edges you have to walk along to get from one vertex to the other, so for instance the distance from a vertex to the opposite vertex is 3. It&#8217;s this notion of distance that I&#8217;ll be talking about in the next paragraph. </p>
<p>Since <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is transitive, if <img src='http://s0.wp.com/latex.php?latex=v%5Csim+w&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='v&#92;sim w' title='v&#92;sim w' class='latex' /> for any two vertices at distance <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />, then we also know that <img src='http://s0.wp.com/latex.php?latex=t%5Csim+u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t&#92;sim u' title='t&#92;sim u' class='latex' /> whenever you can get from <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> by a series of steps, each of which takes you a distance <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />. For example, if two neighbouring vertices are equivalent, then since you can get from any vertex to any other vertex by walking along edges, it follows that all vertices are equivalent. That isn&#8217;t very interesting. But if two vertices that are distance 2 apart (so they are opposite vertices of some face) are equivalent, then that tells you that two vertices are equivalent whenever one can be reached from the other by a series of diagonal jumps across faces. And if you think about it for a bit, you&#8217;ll see that if you start at a vertex <img src='http://s0.wp.com/latex.php?latex=v&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='v' title='v' class='latex' />, then you <em>can&#8217;t</em> reach all other vertices if you take steps of size 2. The set of vertices you can reach forms (the vertices of) a regular tetrahedron, as does the set of vertices you can&#8217;t reach. 
So we&#8217;ve found another equivalence relation: two vertices are equivalent if they are an even distance apart (and therefore either 0 or 2 apart), and the equivalence classes can be thought of as two tetrahedra. </p>
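<p>The three cases (steps of size 1, 2 and 3) can be worked out mechanically. Here is a small Python sketch, offered as an illustration only: it assumes the vertices are labelled by triples of 0s and 1s, so that the edge distance is the number of coordinates in which two vertices differ, and the function names are made up for the purpose.</p>

```python
from itertools import product

# Assumption: vertices are 0/1 triples, so walking along one edge flips
# exactly one coordinate and the edge distance is the Hamming distance.
VERTICES = list(product((0, 1), repeat=3))

def edge_distance(v, w):
    """Number of edges on a shortest path from v to w."""
    return sum(a != b for a, b in zip(v, w))

def classes_generated_by(d):
    """Classes of the smallest equivalence relation in which any two
    vertices at edge distance d are equivalent."""
    remaining = set(VERTICES)
    classes = []
    while remaining:
        start = remaining.pop()
        component, frontier = {start}, [start]
        while frontier:
            v = frontier.pop()
            for w in VERTICES:
                if w not in component and edge_distance(v, w) == d:
                    component.add(w)
                    frontier.append(w)
        remaining -= component
        classes.append(sorted(component))
    return sorted(classes)

for d in (1, 2, 3):
    print(d, [len(c) for c in classes_generated_by(d)])
# 1 [8]
# 2 [4, 4]
# 3 [2, 2, 2, 2]
```

<p>Steps of size 1 glue everything together, steps of size 2 give the two tetrahedra, and steps of size 3 give the four pairs of opposite vertices.</p>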
<p>If we look at distance 3 then we get back to the partition into pairs of opposite vertices, so that won&#8217;t give us anything new. But the tetrahedra should interest us: they&#8217;ve given us another action (of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on the set that consists of these two tetrahedra).</p>
<p>Everything I&#8217;ve just said applies just as well if I consider the group of rotations of the cube rather than the full symmetry group. Since the full symmetry group is just the product of the rotation group by <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' />, I&#8217;ll concentrate on the rotation group instead, and I&#8217;ll use this action to say something about it. Let&#8217;s call the rotation group <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> acts non-trivially on the set consisting of the two tetrahedra, and the group of permutations of that set is just <img src='http://s0.wp.com/latex.php?latex=S_2%3DC_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_2=C_2' title='S_2=C_2' class='latex' />, we know that we&#8217;ve got a surjective homomorphism from <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' />.</p>
<p>Writing that makes me realize that I have some more principles to set out, about using group actions to find interesting quotients.</p>
<li><em>Quotient groups are images of homomorphisms are quotient groups.</em></li>
<li><em>Group actions are homomorphisms to groups of transformations of a set.</em></li>
<p>Putting those two principles together gives us this.</p>
<li><em>Non-trivial unfaithful group actions give rise to non-trivial quotients.</em></li>
<p>You might object that this is implicit in what I said before. As I pointed out then, non-trivial unfaithful group actions give rise to non-trivial normal subgroups, so the above principle follows if you simply quotient by the normal subgroups you obtain. But I&#8217;m actually saying slightly more than that: if you can identify the image of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> under the action, as you often can, then you can actually say what the quotient group is, rather than merely saying that it&#8217;s what you get when you quotient by the normal subgroup. </p>
<p>This is actually a general remark about the isomorphism theorem. Why is it useful? One reason is that it gives you a way of identifying quotient groups. If you&#8217;ve got some normal subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and want to describe <img src='http://s0.wp.com/latex.php?latex=G%2FH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G/H' title='G/H' class='latex' />, then a good strategy is to try to come up with a homomorphism defined on <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> that has kernel <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. Then the image of that homomorphism is isomorphic to the quotient group. A trivial way of carrying out this programme is to take the quotient map from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=G%2FH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G/H' title='G/H' class='latex' />, but that&#8217;s not very helpful. What one wants to do is to come up with a less abstractly defined homomorphism with kernel <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. That does two nice things at once: it gives you an insight into why <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is normal, and it gives you a description of the quotient.</p>
<p>Anyhow, we&#8217;ve got a surjective homomorphism from <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> has a normal subgroup of index 2 and the quotient is isomorphic to <img src='http://s0.wp.com/latex.php?latex=C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2' title='C_2' class='latex' /> (as it must be since it has order 2). What is the normal subgroup? If you remember that <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' /> it will come as no surprise to learn that the normal subgroup is <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />.</p>
<p>Can we see this geometrically? Yes we can. The kernel of this particular action is the set of rotations that send the two tetrahedra to themselves, which, since there are only two tetrahedra, is the set of rotations that send one of the tetrahedra to itself. In other words, it is isomorphic to the rotation group of a regular tetrahedron. It is fairly easy to show that using rotations you can achieve any even permutation of the vertices of a regular tetrahedron, but you can&#8217;t swap just two vertices. So that group is <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />. </p>
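<p>For anyone who prefers to see the kernel listed rather than imagined, the following Python sketch computes it directly. It is a brute-force check under the assumption that the cube has vertices with coordinates +1 or -1 and that rotations are the signed coordinate permutations of determinant 1; the helper names are hypothetical.</p>

```python
from itertools import permutations, product

def rotations():
    """The 24 signed coordinate permutations (perm, signs) with determinant +1."""
    result = []
    for perm in permutations(range(3)):
        inversions = sum(1 for i in range(3) for j in range(i + 1, 3)
                         if perm[i] > perm[j])
        for signs in product((1, -1), repeat=3):
            if (-1) ** inversions * signs[0] * signs[1] * signs[2] == 1:
                result.append((perm, signs))
    return result

def apply(rot, v):
    """Image of the vector v under the rotation rot."""
    perm, signs = rot
    return tuple(signs[i] * v[perm[i]] for i in range(3))

# One of the two tetrahedra: the vertices whose coordinates multiply to +1
# (any two of them differ in exactly two coordinates).
TETRAHEDRON = sorted(v for v in product((1, -1), repeat=3)
                     if v[0] * v[1] * v[2] == 1)

def is_even(p):
    """True if the permutation p (a tuple of indices) has evenly many inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return inversions % 2 == 0

R = rotations()
# The kernel of the action on the two tetrahedra: rotations fixing this one.
kernel = [r for r in R if all(apply(r, v) in TETRAHEDRON for v in TETRAHEDRON)]
# How each kernel element permutes the four vertices of the tetrahedron.
vertex_perms = {tuple(TETRAHEDRON.index(apply(r, v)) for v in TETRAHEDRON)
                for r in kernel}

print(len(R), len(kernel))                                    # 24 12
print(len(vertex_perms), all(is_even(p) for p in vertex_perms))  # 12 True
```

<p>The kernel has index 2 in the rotation group, and its twelve elements induce twelve distinct permutations of the four vertices, all of them even.</p>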
<p>The general point I&#8217;m trying to demonstrate by saying all this is that we can really get inside the group of rotations (or the full symmetry group) of the cube by considering the various different actions that we can define in terms of the original action of the group on the cube. In a systematic way we can come up with several actions, and these actions tell us that the group is isomorphic to <img src='http://s0.wp.com/latex.php?latex=S_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4' title='S_4' class='latex' /> (or <img src='http://s0.wp.com/latex.php?latex=S_4%5Ctimes+C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_4&#92;times C_2' title='S_4&#92;times C_2' class='latex' />) as well as giving us descriptions of various normal subgroups and quotients.</p>
<p>This post is getting long enough that I think I&#8217;d better split it in two. But before I finish this part, let&#8217;s have a quick look at another group, namely <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />. We could treat <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> as the group of rotational symmetries of a regular tetrahedron, but instead let us think of it in the way that it is usually defined &#8212; as the group of even permutations of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' />. </p>
<p>Once again we have a group that is defined in terms of an action on a set. So can we use that action to define another action? The orbit of any point is the whole set, so that doesn&#8217;t help much. What&#8217;s more, given any two (unordered) <em>pairs</em> of elements, there is an even permutation that takes one to the other, so any equivalence relation that is preserved by <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> would have to be trivial. So the equivalence-relations trick doesn&#8217;t work either.</p>
<p>Here&#8217;s a set on which <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> acts interestingly: its elements are the three partitions of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' /> into two sets of equal size. (These partitions could be represented in a shorthand notation as {12,34}, {13,24} and {14,23}.) If you do a permutation, then it sends the partition to the corresponding permuted elements. For example, if you do the permutation <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' />, then it changes {12,34} to {23,14}, which is the same as the partition {14,23}.</p>
<p>It doesn&#8217;t take long to see that every cyclic permutation of the three partitions is possible and that if one partition is fixed, then so are the other two. Therefore, the image of this action is the cyclic group <img src='http://s0.wp.com/latex.php?latex=C_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_3' title='C_3' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> has <img src='http://s0.wp.com/latex.php?latex=C_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_3' title='C_3' class='latex' /> as a quotient. What is the kernel of the action? We&#8217;ve just seen that a 3-cycle has a non-trivial effect, but if we try a transposition pair, we find that it sends each partition to itself. For instance, if we take the pair <img src='http://s0.wp.com/latex.php?latex=%2812%29%2834%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(34)' title='(12)(34)' class='latex' />, then it sends {12,34} to {21,43}, {13,24} to {24,13} and {14,23} to {23,14}. In each case, I&#8217;ve just given a different notation for the partition we started with. It follows that the kernel is the identity together with the three transposition pairs. Therefore, this group, which is isomorphic to <img src='http://s0.wp.com/latex.php?latex=C_2%5Ctimes+C_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_2&#92;times C_2' title='C_2&#92;times C_2' class='latex' />, is a normal subgroup of <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> (a fact that you have almost certainly been shown, but quite possibly in a different way).</p>
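<p>The image and the kernel of this action are small enough to enumerate. The sketch below does so in Python; it assumes a permutation of {1,2,3,4} is stored as a tuple whose entry in position i-1 is the image of i, and the names <code>A4</code>, <code>PARTITIONS</code> and <code>action_on_partitions</code> are illustrative.</p>

```python
from itertools import permutations

def is_even(p):
    """True if the permutation p has an even number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return inversions % 2 == 0

# Assumption: p[i - 1] is the image of i, for i in {1, 2, 3, 4}.
A4 = [p for p in permutations((1, 2, 3, 4)) if is_even(p)]

# The three partitions {12,34}, {13,24} and {14,23}.
PARTITIONS = [
    frozenset({frozenset({1, 2}), frozenset({3, 4})}),
    frozenset({frozenset({1, 3}), frozenset({2, 4})}),
    frozenset({frozenset({1, 4}), frozenset({2, 3})}),
]

def action_on_partitions(p):
    """The permutation of the three partitions induced by p, as indices."""
    return tuple(
        PARTITIONS.index(frozenset(frozenset(p[x - 1] for x in block)
                                   for block in part))
        for part in PARTITIONS
    )

image = {action_on_partitions(p) for p in A4}
kernel = sorted(p for p in A4 if action_on_partitions(p) == (0, 1, 2))

print(len(A4), sorted(image))  # 12 [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
print(kernel)  # [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
```

<p>The image consists of the three cyclic permutations of the partitions, and the kernel is the identity together with (12)(34), (13)(24) and (14)(23), written here in one-line form.</p>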
<p>The idea of considering partitions might seem as though it sprang from nowhere. That is a common illusion in mathematics, which arises when an author neglects to mention the thoughts that provoke an idea. I confess that I&#8217;ve just done that, so let me give two other ways of coming up with essentially the same action, the first of which is the way that I used without saying so. </p>
<p>If we recall that <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> is isomorphic to the rotation group of a regular tetrahedron, then we can apply the same kinds of tricks that we used for the cube. And we find that the group acts on the three lines that connect midpoints of opposite edges. (I&#8217;ve mentioned this action before, I think.) Again, if one line is fixed, then the other two must be as well. So now we have a geometrical way of seeing that <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> has <img src='http://s0.wp.com/latex.php?latex=C_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_3' title='C_3' class='latex' /> as a quotient. The kernel is given by the identity and three half turns about the lines just specified. We could even think of the tetrahedron as one of the ones made out of alternate vertices of a cube, in which case the three lines are the coordinate axes. I then wanted to forget the geometry, so I needed some kind of object made out of the elements of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' /> that could stand in for the lines. Since each line joins the midpoints of two opposite edges, and two opposite edges have disjoint sets of vertices at the end of them, it was natural to associate lines with partitions of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' /> into two sets of size 2. 
Thus, for instance, if we think of the numbers 1 to 4 as labels of vertices, we label the edges by pairs of such numbers, and the line joining the centre of the edge 13 to the centre of the edge 24 will be represented by the partition {13,24}.</p>
<p>A second way of thinking about it is one that I want to touch on only briefly because it is better thought of as an <em>intrinsic</em> action. However, it seems wrong not to mention it. The group <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />, like all groups, acts on itself by conjugation. You may remember that I discussed towards the end of <a href="http://gowers.wordpress.com/2011/10/16/permutations/">a post on permutations</a> how to conjugate permutations. The quick thing to remember is that if you want to calculate the cycle representation of <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5Crho%5Csigma%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma&#92;rho&#92;sigma^{-1}' title='&#92;sigma&#92;rho&#92;sigma^{-1}' class='latex' /> then you take the cycle representation of <img src='http://s0.wp.com/latex.php?latex=%5Crho&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho' title='&#92;rho' class='latex' /> and apply <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> to everything inside it. (This is because if <img src='http://s0.wp.com/latex.php?latex=%5Crho%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho(x)=y' title='&#92;rho(x)=y' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5Crho%5Csigma%5E%7B-1%7D%28%5Csigma%28x%29%29%3D%5Csigma%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma&#92;rho&#92;sigma^{-1}(&#92;sigma(x))=&#92;sigma(y)' title='&#92;sigma&#92;rho&#92;sigma^{-1}(&#92;sigma(x))=&#92;sigma(y)' class='latex' />.) </p>
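<p>The relabelling rule for conjugation is easy to test on an example. In this Python sketch permutations are dictionaries on {1,2,3,4}, and <code>from_cycles</code>, <code>compose</code>, <code>inverse</code> and <code>conjugate</code> are hypothetical helper names; the particular choice of a 3-cycle for each permutation is only for illustration.</p>

```python
def from_cycles(cycles, n=4):
    """The permutation of {1, ..., n}, as a dict, with the given disjoint cycles."""
    p = {x: x for x in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            p[x] = cycle[(i + 1) % len(cycle)]
    return p

def compose(p, q):
    """The permutation 'p after q', that is, x goes to p(q(x))."""
    return {x: p[q[x]] for x in q}

def inverse(p):
    """The inverse permutation."""
    return {y: x for x, y in p.items()}

def conjugate(sigma, rho):
    """The permutation sigma rho sigma^{-1}."""
    return compose(sigma, compose(rho, inverse(sigma)))

sigma = from_cycles([(1, 2, 3)])
rho_cycles = [(2, 4, 3)]

# The rule: apply sigma to every entry in the cycle representation of rho.
relabelled = [tuple(sigma[x] for x in cycle) for cycle in rho_cycles]

print(relabelled)  # [(3, 4, 1)]
print(conjugate(sigma, from_cycles(rho_cycles)) == from_cycles(relabelled))  # True
```

<p>Here (3,4,1) is just another way of writing the 3-cycle (134), so conjugating (243) by (123) gives (134).</p>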
<p>A corollary of this observation is that conjugation preserves cycle type. But as I&#8217;ve just said, every group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on itself by conjugation: with each element <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> we consider the function <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3AG%5Cto+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g):G&#92;to G' title='&#92;phi(g):G&#92;to G' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}' title='ghg^{-1}' class='latex' /> (which gives us a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to the group of automorphisms of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />). Since conjugation preserves cycle type, if <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is a group of permutations, then any two permutations in the same orbit of the conjugation action must be of the same cycle type.</p>
<p>Now this observation is particularly nice when it comes to <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />, because there is one cycle type, namely that of <img src='http://s0.wp.com/latex.php?latex=%2812%29%2834%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(34)' title='(12)(34)' class='latex' />, that is shared by only three permutations. So that gives us a strong chance of finding a non-trivial homomorphism, as indeed we do. The conjugation action of <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> permutes the three transposition pairs (which, you cannot fail to have noticed, correspond in an obvious way to partitions of <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' /> into two sets of size 2), and the rest of the discussion is basically the same as the one I gave just now.</p>
<p>As a final thought, let&#8217;s consider what happens if we look at the action of <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> on the set of 3-cycles. There are eight 3-cycles, which turn out to split into two conjugacy classes. (Why two? Well, if you want to conjugate <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> so that it becomes <img src='http://s0.wp.com/latex.php?latex=%28132%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(132)' title='(132)' class='latex' />, then the obvious way of doing it is to use a permutation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> that takes 1 to 1, 2 to 3 and 3 to 2. But the only such permutation is <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' />, which is odd. That&#8217;s not quite your only option because <img src='http://s0.wp.com/latex.php?latex=%28132%29%3D%28321%29%3D%28213%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(132)=(321)=(213)' title='(132)=(321)=(213)' class='latex' />, but the other two possibilities fail for similar reasons &#8212; as they must, because they are derived from the first by composing with an even permutation.) So if we take one of these conjugacy classes, say the one that contains <img src='http://s0.wp.com/latex.php?latex=%28123%29%2C+%28142%29%2C+%28134%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123), (142), (134)' title='(123), (142), (134)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28243%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(243)' title='(243)' class='latex' />, we find that <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> acts on it. 
If you look in detail at what that action does, you find that it is faithful: you can get any even permutation of those four 3-cycles. In fact, you&#8217;ll find a rather nice pattern: if you associate each 3-cycle with the element that it doesn&#8217;t do anything to &#8212; so, for example, you associate <img src='http://s0.wp.com/latex.php?latex=%28134%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(134)' title='(134)' class='latex' /> with 2 &#8212; then the effect of conjugating by an even permutation is to permute the 3-cycles according to that correspondence. For example, the permutation <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> fixes <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> when you conjugate, and sends <img src='http://s0.wp.com/latex.php?latex=%28243%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(243)' title='(243)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%28134%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(134)' title='(134)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%28142%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(142)' title='(142)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%28243%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(243)' title='(243)' class='latex' />, which in terms of the points missed out is taking 1 to 2 to 3 to 1. However, even if this pattern is nice, it doesn&#8217;t help us (as far as I can see) work anything further out about the group structure of <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />.</p>
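<p>Both the splitting into two conjugacy classes and the pattern with the missed-out points can be confirmed by enumeration. The Python sketch below is one way to do it, assuming permutations are stored as tuples with the image of i in position i-1; all function names are illustrative.</p>

```python
from itertools import permutations

def is_even(p):
    """True if the permutation p has an even number of inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if p[i] > p[j])
    return inversions % 2 == 0

def compose(p, q):
    """The permutation 'p after q': i goes to p(q(i))."""
    return tuple(p[q[i] - 1] for i in range(4))

def inverse(p):
    """The inverse permutation."""
    inv = [0] * 4
    for i, y in enumerate(p):
        inv[y - 1] = i + 1
    return tuple(inv)

def conjugate(g, h):
    """The permutation g h g^{-1}."""
    return compose(g, compose(h, inverse(g)))

def fixed_point(p):
    """The single point that a 3-cycle leaves alone."""
    return next(i + 1 for i in range(4) if p[i] == i + 1)

A4 = [p for p in permutations((1, 2, 3, 4)) if is_even(p)]
# In A_4 the elements with exactly one fixed point are the 3-cycles.
three_cycles = [p for p in A4 if sum(p[i] == i + 1 for i in range(4)) == 1]

classes = {frozenset(conjugate(g, h) for g in A4) for h in three_cycles}
# Conjugating by g moves the missed-out point x of h to g(x).
pattern_holds = all(fixed_point(conjugate(g, h)) == g[fixed_point(h) - 1]
                    for g in A4 for h in three_cycles)

print(len(three_cycles), sorted(len(c) for c in classes))  # 8 [4, 4]
print(pattern_holds)  # True
```

<p>Each class contains exactly one 3-cycle missing each of the points 1, 2, 3 and 4, which is why the conjugation action on a class looks just like the original action on the four points.</p>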
<p>As you might suspect, the pattern has a geometrical explanation. If we label the vertices, then we can think of a 3-cycle as a face <em>with an orientation</em>. Roughly speaking, an orientation is a little circle with an arrow on it that tells you which way of going round you regard as clockwise (which isn&#8217;t obvious in the abstract, since you can look at a plane from two different sides). So the 3-cycle <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> represents the face with vertices 1, 2 and 3, together with the instruction that if you want to spin round clockwise then you should go the same way round as the path from 1 to 2 to 3 to 1. Each face has two possible orientations, so there are eight oriented faces. Moreover, the rotations of a regular tetrahedron act in an obvious way on oriented faces, and can never take an oriented face to the same face with the opposite orientation. So we have a geometrical interpretation of the fact that the eight 3-cycles split up into two conjugacy classes.</p>
<p><strong>Summary.</strong></p>
<p>Not for the first time in this series of posts, I have been surprised by how much there is to say, even when I thought I was restricting the scope considerably. This post will have a second part in which I discuss intrinsically defined actions, and I&#8217;ll try to provide some kind of summary too &#8212; since otherwise you may feel that all I&#8217;ve really done is discuss a few examples in detail. Of course, I <em>have</em> done that, but there are several general messages embedded in the discussion, of which perhaps the most important is the amount that you can learn about a group from the possible concrete ways in which it can be defined, and in particular from the actions that can be derived from those definitions.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/25/group-actions-iii-whats-the-point-of-them/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Normal subgroups and quotient groups</title>
		<link>http://gowers.wordpress.com/2011/11/20/normal-subgroups-and-quotient-groups/</link>
		<comments>http://gowers.wordpress.com/2011/11/20/normal-subgroups-and-quotient-groups/#comments</comments>
		<pubDate>Sun, 20 Nov 2011 15:43:21 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3779</guid>
		<description><![CDATA[The traditional presentation of normal subgroups and quotient groups goes something like this. First, you define a subgroup to be normal if it satisfies a certain funny condition. Then, given a group G and a normal subgroup H, you show that you can define an operation on the cosets of H, and that that operation turns [...]]]></description>
			<content:encoded><![CDATA[<p>The traditional presentation of normal subgroups and quotient groups goes something like this. First, you define a subgroup to be normal if it satisfies a certain funny condition. Then, given a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and a normal subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, you show that you can define an operation on the cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, and that that operation turns the set of all cosets into a group, called the quotient group. Ideally, you also show that one can&#8217;t give a natural group structure to the left cosets of an arbitrary subgroup: that justifies restricting attention to normal subgroups.</p>
<p>There&#8217;s nothing terribly wrong with this approach, but it does leave one question unanswered: why bother with all this stuff? The traditional approach to <em>that</em> question is to ignore it, confident that the answer will gradually reveal itself. The more group theory you do, the more normal subgroups and quotients will arise naturally and demonstrate their utility, so if you just diligently keep studying, you will (fairly soon) come to regard normal subgroups and quotient groups as natural concepts that were obviously worth introducing.<br />
<span id="more-3779"></span></p>
<p>But a slight variant of the question is harder to answer: why did anybody bother to introduce these concepts in the first place? Surely the concepts were introduced for a <em>reason</em>, and not in the vague hope that they would turn out to be useful.</p>
<p>The obvious way of answering this second question is to look at the history. It turns out that normal subgroups were introduced by Galois (who was also the founder of group theory) as part of his study of the solubility of polynomials by radicals. That&#8217;s slightly unfortunate, since it means that to understand why normal subgroups were introduced, one has to put in a lot of work understanding the theory of solving polynomials.</p>
<p>However, there is another way of justifying the introduction of a new concept into mathematics. Instead of looking at the <em>actual</em> history of that concept, one can look at a <em>fictitious</em> history. If you can tell a plausible story about why a concept <em>might have been</em> invented, then that is sufficient to make it seem reasonable. It solves the mystery of how anyone could have thought of the concept, and it also shows that it was pretty well inevitable that the concept would have been introduced sooner or later. </p>
<p>In this post, then, I&#8217;d like to give a fictitious account of why normal subgroups and quotient groups were introduced into group theory, once some of the more basic concepts were already in place.</p>
<p>The first phase of group theory (in this fictitious account) consisted in spotting that many mathematical structures, such as symmetries of Platonic solids, permutations of a finite set, and non-singular <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> matrices, had features in common that could be abstracted out. This led to the formulation of the axioms for group theory: associativity, identity, inverses. </p>
<p>It was noticed very early &#8212; indeed, this observation was part of what drove the initial development of the subject &#8212; that two groups could be defined very differently and yet be &#8220;basically the same&#8221;. For example, the group of rotations of a regular pentagon was basically the same as the group of integers mod 5 under addition, and the group of symmetries of a rectangle (that wasn&#8217;t a square) was basically the same as the group of transformations of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^3' title='&#92;mathbb{R}^3' class='latex' /> that consisted of the identity and the three half turns about the coordinate axes. </p>
<p>The first attempts to make precise this intuition that groups could be &#8220;different but basically the same&#8221; were a little clumsy. Two groups <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> were said to be <em>identical up to permutation and relabelling</em> [please bear in mind that none of what I'm saying is true -- this definition included] if you could find orderings of the elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and the elements of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> such that if you formed the multiplication tables, then they would correspond, in the sense that if the element listed in the <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' />th place times the element listed in the <img src='http://s0.wp.com/latex.php?latex=s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s' title='s' class='latex' />th place equals the element listed in the <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' />th place in <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, then the same is true in <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. 
Later, this definition was tidied up so that it became the definition of <em>isomorphism</em> that we are now familiar with: an <em>isomorphism</em> from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is a bijection <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to H' title='&#92;phi:G&#92;to H' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28xy%29%3D%5Cphi%28x%29%5Cphi%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(xy)=&#92;phi(x)&#92;phi(y)' title='&#92;phi(xy)=&#92;phi(x)&#92;phi(y)' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x%2Cy%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,y&#92;in G' title='x,y&#92;in G' class='latex' />. Two groups <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> were said to be <em>isomorphic</em> if there was an isomorphism between them.</p>
<p>A further important step was taken when it was observed that functions <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> that satisfy the multiplicativity property <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28xy%29%3D%5Cphi%28x%29%5Cphi%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(xy)=&#92;phi(x)&#92;phi(y)' title='&#92;phi(xy)=&#92;phi(x)&#92;phi(y)' class='latex' /> were interesting and important even if they weren&#8217;t bijections. To give just one of many examples, an extremely useful property of determinants is that det<img src='http://s0.wp.com/latex.php?latex=%28AB%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(AB)' title='(AB)' class='latex' /> is always equal to det<img src='http://s0.wp.com/latex.php?latex=%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(A)' title='(A)' class='latex' />det<img src='http://s0.wp.com/latex.php?latex=%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(B)' title='(B)' class='latex' />. Phenomena such as this led people to study what were initially called <em>multiplicative functions</em> between groups. Later, they were renamed <em>homomorphisms</em>, to stress the similarity with isomorphisms.</p>
<p>It was noted almost immediately that if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to H' title='&#92;phi:G&#92;to H' class='latex' /> is a multiplicative function, or homomorphism as we now call it, then the set of <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=e' title='&#92;phi(g)=e' class='latex' /> forms a subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. Moreover, many natural subgroups occur in that way. For example, the group of rotations of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^3' title='&#92;mathbb{R}^3' class='latex' /> is what you get if you take the group of all rigid motions of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^3' title='&#92;mathbb{R}^3' class='latex' /> that fix the origin and restrict to the ones that have determinant equal to 1. For that and many similar reasons, the notion of the <em>kernel</em> of a homomorphism was introduced.</p>
<p>By this time, people had got a bit of a taste for the purely abstract study of group theory: it excited them that they could think about a group such as <img src='http://s0.wp.com/latex.php?latex=C_5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_5' title='C_5' class='latex' /> without saying whether its elements were rotations of a pentagon or integers mod 5 or something else entirely. So people started studying groups <em>for their own sake</em>. And one of the first questions they asked was this: we know that kernels of homomorphisms are subgroups, but what about the converse? That is, is every subgroup the kernel of some homomorphism?</p>
<p>This problem turned out not to be very interesting, since there were easy counterexamples. For example, if you take the permutation group <img src='http://s0.wp.com/latex.php?latex=S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_3' title='S_3' class='latex' /> and take the subgroup that consists of the identity and the transposition <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' />, then that is not the kernel of any homomorphism. The original proof of this fact went something like this. Suppose we know that <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> goes to the identity. We can use that to deduce that, say, <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' /> also goes to the identity. We do this by &#8220;using <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> to do <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' />&#8221; as follows. We first find a permutation that switches <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> &#8212; the obvious one being <img src='http://s0.wp.com/latex.php?latex=%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(13)' title='(13)' class='latex' />. 
We then switch <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' />, perform the permutation <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> and finally switch <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> back again. More formally, we calculate the permutation <img src='http://s0.wp.com/latex.php?latex=%2813%29%2812%29%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(13)(12)(13)' title='(13)(12)(13)' class='latex' />, and we note that because we began and ended by swapping <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> round, this new permutation does to <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> what the old one did to <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and vice versa. 
So the resulting permutation is <img src='http://s0.wp.com/latex.php?latex=%2832%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(32)' title='(32)' class='latex' /> instead of <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> &#8212; and <img src='http://s0.wp.com/latex.php?latex=%2832%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(32)' title='(32)' class='latex' /> is the same as <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' />. </p>
<p>What does that prove? Well, if we now think about what <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(23)' title='&#92;phi(23)' class='latex' /> must be, it is the result of multiplying <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(13)' title='&#92;phi(13)' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(12)' title='&#92;phi(12)' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(13)' title='&#92;phi(13)' class='latex' />. But since <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(12)' title='&#92;phi(12)' class='latex' /> is the identity and <img src='http://s0.wp.com/latex.php?latex=%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(13)' title='(13)' class='latex' /> is its own inverse, we end up just doing <img src='http://s0.wp.com/latex.php?latex=%5Cphi%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(13)' title='&#92;phi(13)' class='latex' /> and inverting it again. This shows that <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' /> also belongs to the kernel.</p>
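<p>If you would like to check the calculation above by machine, here is a minimal Python sketch. The helper names <code>compose</code> and <code>transposition</code> are my own, and I am assuming the convention that in a product the right-hand permutation is applied first (for this particular product the convention makes no difference, since (13) appears at both ends).</p>

```python
def transposition(a, b, n=3):
    """Return the permutation of {1, ..., n} that swaps a and b, as a dict."""
    perm = {x: x for x in range(1, n + 1)}
    perm[a], perm[b] = b, a
    return perm


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return {x: p[q[x]] for x in q}


t12 = transposition(1, 2)
t13 = transposition(1, 3)
t23 = transposition(2, 3)

# The product (13)(12)(13): swap 1 and 3, do (12), then swap 1 and 3 back.
conjugate = compose(t13, compose(t12, t13))

print(conjugate == t23)  # True
```

<p>So conjugating (12) by (13) really does give (23), which is why any homomorphism that kills (12) must kill (23) as well.</p>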
<p>It was soon realized that this basic trick could be used to generate lots of counterexamples. All you had to do was find a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, a subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, an element <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> and an element <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}' title='ghg^{-1}' class='latex' /> was not in <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. And that was easy to do.</p>
<p>Soon this observation was turned into a formal definition. A subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> was said to be <em>closed under conjugation</em> if <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}&#92;in H' title='ghg^{-1}&#92;in H' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' />. And now the original question was modified in an obvious way. The argument used to generate counterexamples could be encapsulated in the following statement: the kernel of a homomorphism must always be closed under conjugation. (Proof: if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28h%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(h)=e' title='&#92;phi(h)=e' class='latex' />, then </p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cphi%28ghg%5E%7B-1%7D%29%3D%5Cphi%28g%29%5Cphi%28h%29%5Cphi%28g%5E%7B-1%7D%29%3D%5Cphi%28g%29%5Cphi%28g%5E%7B-1%7D%29%3D%5Cphi%28e%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(ghg^{-1})=&#92;phi(g)&#92;phi(h)&#92;phi(g^{-1})=&#92;phi(g)&#92;phi(g^{-1})=&#92;phi(e)=e' title='&#92;phi(ghg^{-1})=&#92;phi(g)&#92;phi(h)&#92;phi(g^{-1})=&#92;phi(g)&#92;phi(g^{-1})=&#92;phi(e)=e' class='latex' />.) So what about the converse? Is every subgroup that is closed under conjugation the kernel of some homomorphism?</p>
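<p>Here is a small sketch that tests the necessary condition by brute force in the symmetric group on three letters. It is only an illustration under some assumptions of mine: permutations are tuples acting on 0, 1, 2 rather than 1, 2, 3, the homomorphism is the sign of a permutation (which I have chosen as an example; it is not discussed above), and all function names are hypothetical.</p>

```python
from itertools import permutations

# Each tuple p represents the permutation sending i to p[i].
S3 = list(permutations(range(3)))


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sign(p):
    """Return +1 for an even permutation and -1 for an odd one."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def closed_under_conjugation(H, G):
    """Check that g h g^{-1} lies in H for every g in G and h in H."""
    return all(compose(g, compose(h, inverse(g))) in H for g in G for h in H)


# Kernel of the sign homomorphism: the even permutations.
kernel_of_sign = {p for p in S3 if sign(p) == 1}

# The identity together with the transposition swapping 0 and 1.
two_element_subgroup = {(0, 1, 2), (1, 0, 2)}

print(closed_under_conjugation(kernel_of_sign, S3))        # True
print(closed_under_conjugation(two_element_subgroup, S3))  # False
```

<p>The kernel passes the test, as the proof says it must, and the two-element subgroup from the counterexample fails it.</p>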
<p>This question is an example of a phenomenon that occurs frequently in mathematics. You find a very simple <em>necessary</em> condition for something to be the case. (In this example, being closed under conjugation is a necessary condition for a subgroup to be the kernel of a homomorphism.) You then attempt to show that that condition is sufficient. It&#8217;s always a pleasant surprise when you manage, since it is far from obvious in advance that the conditions you have identified are the only obstacles to getting what you want. </p>
<p>Here, for instance, it isn&#8217;t obvious at all that just because a subgroup satisfies a condition that it clearly has to satisfy, it must be the kernel of some homomorphism. Where&#8217;s that homomorphism going to come from? There isn&#8217;t another group around, let alone a map from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to that group.</p>
<p>This problem was found quite hard at the time, though the technique used to solve it has since become standard. The original thought process that led to a solution went something like this.</p>
<ul>
<li><em>If you are trying to find something complicated but have no idea where to start, then pretend you&#8217;ve found what you are looking for and see what you can say about it.</em></li>
</ul>
<p>I think of this method as a huge generalization of what we do when we solve equations. If someone says, &#8220;Find a number that gives you 128 when you multiply it by itself and add 7,&#8221; you might (if you had only just started algebra) think as follows. &#8220;I&#8217;ll pretend I&#8217;ve got the number, and I&#8217;ll call it <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. The property <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> has is that <img src='http://s0.wp.com/latex.php?latex=x%5E2%2B7%3D128&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2+7=128' title='x^2+7=128' class='latex' />. That tells me that <img src='http://s0.wp.com/latex.php?latex=x%5E2%3D121&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2=121' title='x^2=121' class='latex' />, which in turn tells me that <img src='http://s0.wp.com/latex.php?latex=x%3D%5Cpm+11&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=&#92;pm 11' title='x=&#92;pm 11' class='latex' />. So let me see whether one of those works. Yes, it does! I can take <img src='http://s0.wp.com/latex.php?latex=x%3D11&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=11' title='x=11' class='latex' /> and then I get <img src='http://s0.wp.com/latex.php?latex=121%2B7%3D128&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='121+7=128' title='121+7=128' class='latex' />.&#8221;</p>
<p>Let&#8217;s try something similar here. I&#8217;ve got a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and a subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> that&#8217;s closed under conjugation. Let&#8217;s pretend we have a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to some group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> and that the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. What can we say about <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />?</p>
<p>Well, one thing we know immediately is that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28h%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(h)=e' title='&#92;phi(h)=e' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />, since that was our initial assumption. What can we do with that information? One thing it tells us is that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28gh%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(gh)' title='&#92;phi(g)=&#92;phi(gh)' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />, and also that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28hg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(hg)' title='&#92;phi(g)=&#92;phi(hg)' class='latex' />. So that tells us that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' /> whenever we can find <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%27%3Dhg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;=hg' title='g&#039;=hg' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=g%27%3Dgh&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;=gh' title='g&#039;=gh' class='latex' />. </p>
<p>Let&#8217;s explore that a little more. Let <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> be an element of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. Can we say precisely for which elements <img src='http://s0.wp.com/latex.php?latex=g%27%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;&#92;in G' title='g&#039;&#92;in G' class='latex' /> it <em>must</em> be the case that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' />? We&#8217;ve shown that it must when <img src='http://s0.wp.com/latex.php?latex=g%27%3Dhg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;=hg' title='g&#039;=hg' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=gh&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gh' title='gh' class='latex' /> for some <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />. To prove that, we used the fact that every <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> belongs to the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. But we also know the converse: that every element of the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> belongs to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. 
So if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7Dg%27%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}g&#039;&#92;in H' title='g^{-1}g&#039;&#92;in H' class='latex' />, since <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%5E%7B-1%7Dg%27%29%3D%5Cphi%28g%29%5E%7B-1%7D%5Cphi%28g%27%29%3D%5Cphi%28g%29%5E%7B-1%7D%5Cphi%28g%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g^{-1}g&#039;)=&#92;phi(g)^{-1}&#92;phi(g&#039;)=&#92;phi(g)^{-1}&#92;phi(g)=e' title='&#92;phi(g^{-1}g&#039;)=&#92;phi(g)^{-1}&#92;phi(g&#039;)=&#92;phi(g)^{-1}&#92;phi(g)=e' class='latex' />. </p>
<p>What we have just established, using only the information that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />, is that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7Dg%27%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}g&#039;&#92;in H' title='g^{-1}g&#039;&#92;in H' class='latex' />, which is the same as saying that <img src='http://s0.wp.com/latex.php?latex=g%27%5Cin+gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;&#92;in gH' title='g&#039;&#92;in gH' class='latex' />. In other words, <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is constant on the left cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> but takes different values on different left cosets.</p>
<p>But I could have argued slightly differently. I could have shown that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' /> implies that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%27g%5E%7B-1%7D%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g&#039;g^{-1})=e' title='&#92;phi(g&#039;g^{-1})=e' class='latex' />, and therefore concluded that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3D%5Cphi%28g%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=&#92;phi(g&#039;)' title='&#92;phi(g)=&#92;phi(g&#039;)' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=g%27g%5E%7B-1%7D%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;g^{-1}&#92;in H' title='g&#039;g^{-1}&#92;in H' class='latex' />, or <img src='http://s0.wp.com/latex.php?latex=g%27%5Cin+Hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#039;&#92;in Hg' title='g&#039;&#92;in Hg' class='latex' />. This would have shown that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is constant on the <em>right</em> cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, and that it takes different values on different right cosets.</p>
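<p>Both versions of the statement can be checked on a concrete homomorphism. The sketch below is an assumption-laden illustration rather than part of the argument: I take the group to be the permutations of 0, 1, 2, I take the homomorphism to be the sign of a permutation, and the helper names are my own.</p>

```python
from itertools import permutations

# Each tuple p represents the permutation sending i to p[i].
G = list(permutations(range(3)))


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sign(p):
    """Return +1 for an even permutation and -1 for an odd one."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


kernel = {g for g in G if sign(g) == 1}

# Left-coset version: sign(g) == sign(g2) exactly when g^{-1} g2 is in the kernel.
left_version = all(
    (sign(g) == sign(g2)) == (compose(inverse(g), g2) in kernel)
    for g in G
    for g2 in G
)

# Right-coset version: sign(g) == sign(g2) exactly when g2 g^{-1} is in the kernel.
right_version = all(
    (sign(g) == sign(g2)) == (compose(g2, inverse(g)) in kernel)
    for g in G
    for g2 in G
)

print(left_version, right_version)  # True True
```

<p>Of course, a check on one small example proves nothing in general, but it is reassuring that both criteria single out the same pairs.</p>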
<p>These two observations are incompatible with each other unless every left coset is a right coset and vice versa. So we&#8217;d better check that. What right coset of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> might be equal to <img src='http://s0.wp.com/latex.php?latex=gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH' title='gH' class='latex' />? Well, it had better contain <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=H+g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H g' title='H g' class='latex' /> is a pretty obvious choice. Does <img src='http://s0.wp.com/latex.php?latex=gH%3DHg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH=Hg' title='gH=Hg' class='latex' />? The answer is yes if and only if <img src='http://s0.wp.com/latex.php?latex=gHg%5E%7B-1%7D%3DH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gHg^{-1}=H' title='gHg^{-1}=H' class='latex' />. But <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is closed under conjugation, so we&#8217;re OK. [By the way, if you are anxious about my writing equations that involve not just elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> but subgroups of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and then doing things like multiplying both sides on the right by <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}' title='g^{-1}' class='latex' />, then you have good instincts. The reasoning is valid, but it is important to check that it is valid. I'll leave that as an exercise if you haven't done it already.]</p>
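<p>To see the coset condition in action, here is a brute-force comparison of left and right cosets. Again this is just a sketch: the group is the permutations of 0, 1, 2, the two subgroups are examples I have picked, and <code>left_coset</code>, <code>right_coset</code> and <code>cosets_coincide</code> are hypothetical helper names.</p>

```python
from itertools import permutations

# Each tuple p represents the permutation sending i to p[i].
G = list(permutations(range(3)))


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def left_coset(g, H):
    """Return the left coset gH as a frozenset."""
    return frozenset(compose(g, h) for h in H)


def right_coset(g, H):
    """Return the right coset Hg as a frozenset."""
    return frozenset(compose(h, g) for h in H)


def cosets_coincide(H, G):
    """Check whether gH equals Hg for every g in G."""
    return all(left_coset(g, H) == right_coset(g, H) for g in G)


# The identity and the two 3-cycles: closed under conjugation.
rotations = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}

# The identity and one transposition: not closed under conjugation.
identity_and_swap = {(0, 1, 2), (1, 0, 2)}

print(cosets_coincide(rotations, G))          # True
print(cosets_coincide(identity_and_swap, G))  # False
```

<p>The subgroup that is closed under conjugation has matching left and right cosets, and the one that is not closed under conjugation does not.</p>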
<p>Where have we got to now? We have shown that <em>if</em> <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a homomorphism from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to a group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' />, and <em>if</em> the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> must be constant on the cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> &#8212; and we have also shown that I&#8217;m allowed to say &#8220;cosets&#8221; because the left and right cosets coincide. Also, <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> must take different values on different cosets.</p>
<p>Is that all we can say? Very much not. If we have an ounce of mathematical curiosity, then sooner or later we will start to wonder whether we can say anything about how the values of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> on different cosets are related to each other. If we know that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> takes the value <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> everywhere on the coset <img src='http://s0.wp.com/latex.php?latex=g_1H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1H' title='g_1H' class='latex' /> and the value <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> everywhere on the coset <img src='http://s0.wp.com/latex.php?latex=g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2H' title='g_2H' class='latex' />, can we deduce anything from that? Well, the main thing we know about <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is that it is a homomorphism, so let&#8217;s try to use that fact. 
If <img src='http://s0.wp.com/latex.php?latex=h_1%2Ch_2%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_1,h_2&#92;in H' title='h_1,h_2&#92;in H' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1h_1%29%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1h_1)=a' title='&#92;phi(g_1h_1)=a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_2h_2%29%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_2h_2)=b' title='&#92;phi(g_2h_2)=b' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1h_1g_2h_2%29%3Dab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1h_1g_2h_2)=ab' title='&#92;phi(g_1h_1g_2h_2)=ab' class='latex' />, by the multiplicativity property. By what we have just established, that tells us that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> will take the value <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> on the entire coset to which <img src='http://s0.wp.com/latex.php?latex=g_1h_1g_2h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1h_1g_2h_2' title='g_1h_1g_2h_2' class='latex' /> belongs. But what is that coset? To answer that we would like to rewrite <img src='http://s0.wp.com/latex.php?latex=g_1h_1g_2h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1h_1g_2h_2' title='g_1h_1g_2h_2' class='latex' /> as a product that begins with something in <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and ends with something in <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. 
It would be nice if we could let <img src='http://s0.wp.com/latex.php?latex=h_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_1' title='h_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2' title='g_2' class='latex' /> swap places. </p>
<p>Can we say that <img src='http://s0.wp.com/latex.php?latex=h_1g_2%3Dg_2h_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_1g_2=g_2h_1' title='h_1g_2=g_2h_1' class='latex' />? Unfortunately not. But let&#8217;s play around a little. We know that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is closed under conjugation, so we might try to find a conjugation. And we can! Rearranging the equation we were hoping for gives us that <img src='http://s0.wp.com/latex.php?latex=g_2%5E%7B-1%7Dh_1g_2%3Dh_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2^{-1}h_1g_2=h_1' title='g_2^{-1}h_1g_2=h_1' class='latex' />. There is no reason to suppose that that is true, but we do at least know that the left-hand side belongs to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. So we can at least write <img src='http://s0.wp.com/latex.php?latex=g_2%5E%7B-1%7Dh_1g_2%3Dh_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2^{-1}h_1g_2=h_3' title='g_2^{-1}h_1g_2=h_3' class='latex' />. And rearranging that tells us that <img src='http://s0.wp.com/latex.php?latex=g_2h_3%3Dh_1g_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2h_3=h_1g_2' title='g_2h_3=h_1g_2' class='latex' />. So <img src='http://s0.wp.com/latex.php?latex=g_1h_1g_2h_2%3Dg_1g_2h_3h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1h_1g_2h_2=g_1g_2h_3h_2' title='g_1h_1g_2h_2=g_1g_2h_3h_2' class='latex' />. That tells us that the coset that contains <img src='http://s0.wp.com/latex.php?latex=g_1h_1g_2h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1h_1g_2h_2' title='g_1h_1g_2h_2' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />, which is as nice an answer as we could have hoped for.</p>
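<p>The swapping trick can be checked by brute force in a small case. This is only a sketch under assumptions: the group is taken to be the permutations of three points, the subgroup is its three rotations, and the function names are invented for the example.</p>

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Assumed example: G is S_3 and H is its subgroup of rotations.
G = list(permutations(range(3)))
H = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}

def swap_works(g1, h1, g2, h2):
    """Check that h3 = g2^{-1} h1 g2 lies in H and g1 h1 g2 h2 == g1 g2 h3 h2."""
    h3 = compose(inverse(g2), compose(h1, g2))
    lhs = compose(compose(g1, h1), compose(g2, h2))
    rhs = compose(compose(g1, g2), compose(h3, h2))
    return h3 in H and lhs == rhs

all_ok = all(
    swap_works(g1, h1, g2, h2)
    for g1, g2 in product(G, repeat=2)
    for h1, h2 in product(H, repeat=2)
)
print(all_ok)  # True
```

<p>The equality of the two products holds in any group; the part that uses closure under conjugation is the test that the conjugate lands back in the subgroup.</p>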
<p>What does it tell us? Let&#8217;s write <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28gH%29%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(gH)=a' title='&#92;phi(gH)=a' class='latex' /> to mean that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> takes the value <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> everywhere on the coset <img src='http://s0.wp.com/latex.php?latex=gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH' title='gH' class='latex' />. Then we have shown that if <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1H%29%3Da_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1H)=a_1' title='&#92;phi(g_1H)=a_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_2H%29%3Da_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_2H)=a_2' title='&#92;phi(g_2H)=a_2' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1g_2H%29%3Da_1a_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1g_2H)=a_1a_2' title='&#92;phi(g_1g_2H)=a_1a_2' class='latex' />. </p>
<p>OK, we&#8217;ve found all sorts of properties that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> must have, but are we any nearer to finding <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />? Yes we are, in that losing-at-chess sense that our options are getting more and more restricted, which makes it easier to work out what to do. Here&#8217;s a trick that reduces our options still further. One of the observations we have made is that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is constant on every coset and takes different values on different cosets. That means that there is a bijection between the cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> and the elements of the image of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. </p>
<p>How does that help when we don&#8217;t actually know what <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is, or what its image is? It actually helps a lot. We are free to <em>define</em> the image. Can we think of a set that&#8217;s in one-to-one correspondence with the set of all cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />? Yes, of course we can: just take the set of cosets itself! </p>
<p>But hang on, you might say, isn&#8217;t that a bit dangerous? There are lots of sets that are in one-to-one correspondence with any given set, so what reason is there to think that the set itself is a good choice? Well, here are two reasons.</p>
<p>(i) We are given absolutely no data in the problem other than the group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and the subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, so it is highly likely that the homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> and the group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> maps to will be built out of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> in some way.</p>
<p>(ii) In a sense it doesn&#8217;t actually matter what set we define the group operation on, since if we define it on a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cbeta%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta:X&#92;to Y' title='&#92;beta:X&#92;to Y' class='latex' /> is a bijection, then we can use the group operation on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to define essentially the same group operation on <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=y_1%5Ccirc+y_2%3D%5Cbeta%28%5Cbeta%5E%7B-1%7D%28y_1%29%5Ccirc+%5Cbeta%5E%7B-1%7D%28y_2%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y_1&#92;circ y_2=&#92;beta(&#92;beta^{-1}(y_1)&#92;circ &#92;beta^{-1}(y_2))' title='y_1&#92;circ y_2=&#92;beta(&#92;beta^{-1}(y_1)&#92;circ &#92;beta^{-1}(y_2))' class='latex' />.</p>
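<p>To make reason (ii) concrete, here is a minimal sketch of carrying a group operation across a bijection. The particular group (integers mod 3 under addition), the target set of letters, and the names <code>beta</code>, <code>op_x</code> and <code>op_y</code> are assumptions chosen purely for illustration.</p>

```python
# Assumed example: X = {0, 1, 2} with addition mod 3, Y = {"a", "b", "c"}.
beta = {0: "a", 1: "b", 2: "c"}                 # a bijection from X to Y
beta_inv = {y: x for x, y in beta.items()}      # its inverse

def op_x(x1, x2):
    """The group operation on X."""
    return (x1 + x2) % 3

def op_y(y1, y2):
    """The transported operation: pull back to X, combine there, push forward."""
    return beta[op_x(beta_inv[y1], beta_inv[y2])]

# By construction, beta turns the operation on X into the operation on Y.
beta_respects_ops = all(
    beta[op_x(x1, x2)] == op_y(beta[x1], beta[x2])
    for x1 in beta for x2 in beta
)
print(op_y("b", "c"))      # a
print(beta_respects_ops)   # True
```

<p>Nothing about the letters matters here: the table for the operation on the letters is just the mod 3 addition table with the entries relabelled.</p>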
<p>So now we&#8217;ve managed to cut things down further. We want to define a binary operation on the set of all cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> that will make it into a group, and we want the function <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> to the coset <img src='http://s0.wp.com/latex.php?latex=gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH' title='gH' class='latex' /> to be a homomorphism.</p>
<p>Now let&#8217;s go back to what we have managed to establish about <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. An important property was that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1g_2H%29%3D%5Cphi%28g_1H%29%5Cphi%28g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1g_2H)=&#92;phi(g_1H)&#92;phi(g_2H)' title='&#92;phi(g_1g_2H)=&#92;phi(g_1H)&#92;phi(g_2H)' class='latex' /> (where <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28gH%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(gH)' title='&#92;phi(gH)' class='latex' /> meant the constant value that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> takes on the coset <img src='http://s0.wp.com/latex.php?latex=gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH' title='gH' class='latex' />). But now we&#8217;ve decided that we&#8217;re going to define <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28gH%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(gH)' title='&#92;phi(gH)' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gH' title='gH' class='latex' /> itself. The only thing that we haven&#8217;t decided is what the binary operation on the set of all cosets should be. But what we established earlier about <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> forces our hand completely. 
Since <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1H%29%5Cphi%28g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1H)&#92;phi(g_2H)' title='&#92;phi(g_1H)&#92;phi(g_2H)' class='latex' /> must equal <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1g_2H)' title='&#92;phi(g_1g_2H)' class='latex' /> and since <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28gH%29%3DgH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(gH)=gH' title='&#92;phi(gH)=gH' class='latex' />, it follows that <img src='http://s0.wp.com/latex.php?latex=%28g_1H%29%5Ccirc%28g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(g_1H)&#92;circ(g_2H)' title='(g_1H)&#92;circ(g_2H)' class='latex' /> must be <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />. We have arrived at the definition of the quotient group and the quotient map, and thereby solved the problem.</p>
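<p>One way to see why closure under conjugation is doing real work is to test whether the forced rule for multiplying cosets depends on the chosen representatives. The sketch below does this by brute force; the group, the two subgroups and the function names are assumptions made for the example.</p>

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

# Assumed example: G is S_3, with one subgroup that is closed under
# conjugation (A3) and one that is not (T).
G = list(permutations(range(3)))
A3 = frozenset({(0, 1, 2), (1, 2, 0), (2, 0, 1)})
T = frozenset({(0, 1, 2), (1, 0, 2)})

def left_coset(g, H):
    return frozenset(compose(g, h) for h in H)

def product_is_well_defined(H):
    """Check that (g1 g2)H depends only on the cosets g1H and g2H."""
    for g1, g2 in product(G, repeat=2):
        target = left_coset(compose(g1, g2), H)
        for x in left_coset(g1, H):
            for y in left_coset(g2, H):
                if left_coset(compose(x, y), H) != target:
                    return False
    return True

print(product_is_well_defined(A3))  # True
print(product_is_well_defined(T))   # False
```

<p>For the subgroup that is not closed under conjugation, different representatives of the same pair of cosets can give different answers, so the rule does not define an operation at all.</p>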
<hr />
<p>Usually when the quotient group is defined, one defines the binary operation on the set of cosets and then checks that it is well-defined. In the course of the above thoughts, we have basically already checked this. </p>
<p>In my fictitious world, there was one final stage in the early development of group theory, which was that all the thoughts that led to the definition of the quotient group were carefully suppressed. The solution to the kernels classification problem was presented like this. </p>
<p><strong>Theorem.</strong> <em>A subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is the kernel of some homomorphism if and only if it is closed under conjugation.</em></p>
<p><strong>Proof.</strong> First we show that the condition is necessary. Let <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to K' title='&#92;phi:G&#92;to K' class='latex' /> be a homomorphism with kernel <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. Then for every <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> and every <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> we have<br />
<img src='http://s0.wp.com/latex.php?latex=%5Cphi%28ghg%5E%7B-1%7D%29%3D%5Cphi%28g%29%5Cphi%28h%29%5Cphi%28g%29%5E%7B-1%7D%3D%5Cphi%28g%29%5Cphi%28g%29%5E%7B-1%7D%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(ghg^{-1})=&#92;phi(g)&#92;phi(h)&#92;phi(g)^{-1}=&#92;phi(g)&#92;phi(g)^{-1}=e' title='&#92;phi(ghg^{-1})=&#92;phi(g)&#92;phi(h)&#92;phi(g)^{-1}=&#92;phi(g)&#92;phi(g)^{-1}=e' class='latex' />,<br />
which proves that <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}' title='ghg^{-1}' class='latex' /> also belongs to the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />, and thus also to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. That proves that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is closed under conjugation.</p>
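<p>The necessity direction can be illustrated with a familiar homomorphism. The following sketch, which is an assumed example rather than anything from the proof, uses the sign of a permutation of three points, computes its kernel, and checks that the kernel is closed under conjugation.</p>

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """+1 for an even permutation, -1 for an odd one (counting inversions)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

# Assumed example: G is S_3 and the homomorphism is sign, mapping into {1, -1}.
G = list(permutations(range(3)))

is_hom = all(
    sign(compose(g1, g2)) == sign(g1) * sign(g2)
    for g1, g2 in product(G, repeat=2)
)
kernel = {g for g in G if sign(g) == 1}
closed = all(compose(g, compose(h, inverse(g))) in kernel for g in G for h in kernel)

print(is_hom)        # True
print(len(kernel))   # 3
print(closed)        # True
```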
<p>Now suppose that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is closed under conjugation. Define a group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> as follows. Its elements are the left cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> (which, it can be shown, are also the right cosets of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />). We define a binary operation on these cosets by taking <img src='http://s0.wp.com/latex.php?latex=%28g_1H%29%5Ccirc%28g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(g_1H)&#92;circ(g_2H)' title='(g_1H)&#92;circ(g_2H)' class='latex' /> to equal <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />. </p>
<p>There are a few things we must check. First, we must make sure that the definition we have just given does not change if we choose different elements <img src='http://s0.wp.com/latex.php?latex=g_1%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1&#039;' title='g_1&#039;' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2&#039;' title='g_2&#039;' class='latex' /> of the cosets <img src='http://s0.wp.com/latex.php?latex=g_1H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1H' title='g_1H' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2H' title='g_2H' class='latex' />. A quick way of doing that is to note that a different way of defining <img src='http://s0.wp.com/latex.php?latex=%28g_1H%29%5Ccirc%28g_2H%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(g_1H)&#92;circ(g_2H)' title='(g_1H)&#92;circ(g_2H)' class='latex' /> is as the set of all <img src='http://s0.wp.com/latex.php?latex=xy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='xy' title='xy' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+g_1H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in g_1H' title='x&#92;in g_1H' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%5Cin+g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in g_2H' title='y&#92;in g_2H' class='latex' />. That definition clearly depends on the cosets themselves and not on how they are described, but does it give us <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />? Well, it certainly contains <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />. 
In the other direction, <img src='http://s0.wp.com/latex.php?latex=g_1h_1g_2h_2%3Dg_1g_2%28g_2%5E%7B-1%7Dh_1g_2%29h_2%3Dg_1g_2h_3h_2%5Cin+g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2=g_1g_2h_3h_2&#92;in g_1g_2H' title='g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2=g_1g_2h_3h_2&#92;in g_1g_2H' class='latex' />, so it is also contained in <img src='http://s0.wp.com/latex.php?latex=g_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2H' title='g_1g_2H' class='latex' />. </p>
<p>Now let us define a function <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to K' title='&#92;phi:G&#92;to K' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3DgH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=gH' title='&#92;phi(g)=gH' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g_1g_2%29%3Dg_1g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g_1g_2)=g_1g_2H' title='&#92;phi(g_1g_2)=g_1g_2H' class='latex' />, which, by definition of the group operation on <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' />, is equal to <img src='http://s0.wp.com/latex.php?latex=g_1H%5Ccirc+g_2H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1H&#92;circ g_2H' title='g_1H&#92;circ g_2H' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a homomorphism. For <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3DgH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=gH' title='&#92;phi(g)=gH' class='latex' /> to equal <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> we need <img src='http://s0.wp.com/latex.php?latex=g%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in H' title='g&#92;in H' class='latex' />, so the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, as required. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
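<p>The construction in the proof can be carried out explicitly in a small case. This is a sketch under assumptions: the group is the permutations of three points, the subgroup is its three rotations, cosets are stored as frozen sets, and the names <code>phi</code> and <code>coset_op</code> are chosen for the example.</p>

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

# Assumed example: G is S_3 and H is its subgroup of rotations.
G = list(permutations(range(3)))
H = frozenset({(0, 1, 2), (1, 2, 0), (2, 0, 1)})
identity = (0, 1, 2)

def phi(g):
    """The map sending g to the coset gH."""
    return frozenset(compose(g, h) for h in H)

def coset_op(X, Y):
    """Multiply two cosets by multiplying arbitrary representatives.

    This relies on H being closed under conjugation, which makes the
    result independent of the representatives picked.
    """
    x, y = next(iter(X)), next(iter(Y))
    return phi(compose(x, y))

K = {phi(g) for g in G}
is_hom = all(
    phi(compose(g1, g2)) == coset_op(phi(g1), phi(g2))
    for g1, g2 in product(G, repeat=2)
)
kernel = {g for g in G if phi(g) == phi(identity)}

print(len(K))        # 2
print(is_hom)        # True
print(kernel == H)   # True
```

<p>Here the group of cosets has just two elements, the rotations and the reflections, and the map is a disguised version of the sign of a permutation.</p>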
<p>Unfortunately, each year a few of the brighter mathematics students found this argument reasonably easy to digest. So the decision was taken to suppress all mention of why the group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> was constructed. Instead, it became common practice to define the <em>quotient group</em> <img src='http://s0.wp.com/latex.php?latex=G%2FH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G/H' title='G/H' class='latex' /> for no apparent reason and to point out only later that the function <img src='http://s0.wp.com/latex.php?latex=g%5Cmapsto+gH&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;mapsto gH' title='g&#92;mapsto gH' class='latex' />, known as the <em>quotient map</em>, was a homomorphism with kernel <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. Now, at last, the goal of making the concept difficult for everybody had been triumphantly achieved.</p>
<p>A further development was the realization that the method that had been arrived at was very general indeed. After this proof, different notions of &#8220;quotient&#8221; kept appearing all over mathematics, and in an effort to find a unified description of them, the notion of an equivalence relation was formulated. But that is another story (perhaps to be presented in another post).</p>
<hr />
<p>As an afterthought, here&#8217;s a different way of presenting the proof, which uses equivalence relations instead of partitioning into cosets. I&#8217;ll switch to using the phrase &#8220;normal subgroup&#8221; rather than talking about subgroups being closed under conjugation.</p>
<p><strong>Theorem.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> be a normal subgroup of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. Then there is a group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> and a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to K' title='&#92;phi:G&#92;to K' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />.</em></p>
<p><strong>Proof.</strong> Define two elements <img src='http://s0.wp.com/latex.php?latex=g_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1' title='g_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2' title='g_2' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />-<em>equivalent</em>, and write <img src='http://s0.wp.com/latex.php?latex=g_1%5Csim+g_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1&#92;sim g_2' title='g_1&#92;sim g_2' class='latex' />, if there exists <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g_2%3Dg_1h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2=g_1h' title='g_2=g_1h' class='latex' />. It is easy to check that this is an equivalence relation. </p>
<p>Now we shall prove that if <img src='http://s0.wp.com/latex.php?latex=g_1%5Csim+g_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1&#92;sim g_3' title='g_1&#92;sim g_3' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2%5Csim+g_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2&#92;sim g_4' title='g_2&#92;sim g_4' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=g_1g_2%5Csim+g_3g_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2&#92;sim g_3g_4' title='g_1g_2&#92;sim g_3g_4' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=h_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_1' title='h_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_2' title='h_2' class='latex' /> be such that <img src='http://s0.wp.com/latex.php?latex=g_3%3Dg_1h_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_3=g_1h_1' title='g_3=g_1h_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_4%3Dg_2h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_4=g_2h_2' title='g_4=g_2h_2' class='latex' />. Then<br />
<img src='http://s0.wp.com/latex.php?latex=g_3g_4%3Dg_1h_1g_2h_2%3Dg_1g_2%28g_2%5E%7B-1%7Dh_1g_2%29h_2%3Dg_1g_2h_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_3g_4=g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2=g_1g_2h_3' title='g_3g_4=g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2=g_1g_2h_3' class='latex' />,<br />
where <img src='http://s0.wp.com/latex.php?latex=h_3%3D%28g_2%5E%7B-1%7Dh_1g_2%29h_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h_3=(g_2^{-1}h_1g_2)h_2' title='h_3=(g_2^{-1}h_1g_2)h_2' class='latex' /> belongs to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />, since <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is a normal subgroup.</p>
<p>It follows that we can define a group operation on the equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />: if <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> are two equivalence classes, then all products <img src='http://s0.wp.com/latex.php?latex=g_1g_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1g_2' title='g_1g_2' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=g_1%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1&#92;in X' title='g_1&#92;in X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g_2%5Cin+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_2&#92;in Y' title='g_2&#92;in Y' class='latex' /> are equivalent to one another, and we let the equivalence class that contains them all be the product of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' />. </p>
<p>The group axioms for this operation basically follow if we replace &#8220;<img src='http://s0.wp.com/latex.php?latex=%3D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='=' title='=' class='latex' />&#8221; by &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />&#8221; in the usual group axioms. For instance, given any three elements <img src='http://s0.wp.com/latex.php?latex=g_1%2Cg_2%2Cg_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1,g_2,g_3' title='g_1,g_2,g_3' class='latex' /> of the group, we know that <img src='http://s0.wp.com/latex.php?latex=g_1%28g_2g_3%29%3D%28g_1g_2%29g_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1(g_2g_3)=(g_1g_2)g_3' title='g_1(g_2g_3)=(g_1g_2)g_3' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=g_1%28g_2g_3%29%5Csim%28g_1g_2%29g_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g_1(g_2g_3)&#92;sim(g_1g_2)g_3' title='g_1(g_2g_3)&#92;sim(g_1g_2)g_3' class='latex' />, and from that and the definition of the product it follows that <img src='http://s0.wp.com/latex.php?latex=%5Bg_1%5D%28%5Bg_2%5D%5Bg_3%5D%29%3D%28%5Bg_1%5D%5Bg_2%5D%29%5Bg_3%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='[g_1]([g_2][g_3])=([g_1][g_2])[g_3]' title='[g_1]([g_2][g_3])=([g_1][g_2])[g_3]' class='latex' /> (where <img src='http://s0.wp.com/latex.php?latex=%5Bg%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='[g]' title='[g]' class='latex' /> stands for the equivalence class of <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />). 
And a similar argument shows that the map that takes <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Bg%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='[g]' title='[g]' class='latex' /> for each <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> is a homomorphism. The kernel of this map is <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
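<p>The equivalence-relation version is perhaps most familiar from modular arithmetic. The sketch below is an assumed example, written in additive notation: the group is the integers mod 12, the subgroup is the multiples of 3, and normality is automatic because the group is abelian. The names <code>equivalent</code> and <code>cls</code> are invented for the illustration.</p>

```python
from itertools import product

N = 12
G = range(N)          # assumed example: integers mod 12 under addition
H = {0, 3, 6, 9}      # a subgroup; normal because G is abelian

def equivalent(g1, g2):
    """g1 ~ g2 if g2 = g1 + h for some h in H (additive notation)."""
    return (g2 - g1) % N in H

def cls(g):
    """The equivalence class [g], stored as a frozen set."""
    return frozenset(x for x in G if equivalent(g, x))

classes = {cls(g) for g in G}

# If g1 ~ g3 and g2 ~ g4, then g1 + g2 ~ g3 + g4.
compatible = all(
    equivalent((g1 + g2) % N, (g3 + g4) % N)
    for g1, g2, g3, g4 in product(G, repeat=4)
    if equivalent(g1, g3) and equivalent(g2, g4)
)

# The kernel of g to [g] is the set of elements sent to the class of 0.
kernel = {g for g in G if cls(g) == cls(0)}

print(len(classes))   # 3
print(compatible)     # True
print(kernel == H)    # True
```

<p>The three classes are the residues mod 3, so in this example the group of equivalence classes is arithmetic mod 3 under another name.</p>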
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/20/normal-subgroups-and-quotient-groups/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Proving the fundamental theorem of arithmetic</title>
		<link>http://gowers.wordpress.com/2011/11/18/proving-the-fundamental-theorem-of-arithmetic/</link>
		<comments>http://gowers.wordpress.com/2011/11/18/proving-the-fundamental-theorem-of-arithmetic/#comments</comments>
		<pubDate>Fri, 18 Nov 2011 23:47:16 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Numbers and Sets]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3753</guid>
		<description><![CDATA[How much of the standard proof of the fundamental theorem of arithmetic follows from general tricks that can be applied all over the place and how much do you actually have to remember? At first it may seem as though you have to remember quite a bit: there is a non-obvious sequence of lemmas, starting [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3753&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>How much of the standard proof of the fundamental theorem of arithmetic follows from general tricks that can be applied all over the place and how much do you actually have to remember? At first it may seem as though you have to remember quite a bit: there is a non-obvious sequence of lemmas, starting with B&eacute;zout&#8217;s theorem, continuing with the clever proof that if <img src='http://s0.wp.com/latex.php?latex=p%7Cab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|ab' title='p|ab' class='latex' /> then either <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />, bumping that up to a proof for bigger products, and eventually deducing the theorem itself. </p>
<p>But what if one were simply asked to come up with a proof? Would there be any chance of discovering that sequence of lemmas? I maintain that there would &#8212; if, that is, you are aware of certain general tricks.<br />
<span id="more-3753"></span></p>
<p>Let&#8217;s imagine, then, that we don&#8217;t know the proof and are trying to work it out. I&#8217;ll split the whole process up into a number of steps. I&#8217;ll precede the description of each step by a slogan that more or less generates the argument.</p>
<p>1. <em>State the problem carefully and give names to things.</em></p>
<p>Loosely speaking, the theorem states that every positive integer can be factorized in exactly one way. Almost always, if you want to prove that something happens exactly once, then you need to show that it happens at least once and that it happens at most once. So in this case, when we state the problem carefully we end up splitting it into two claims, as follows.</p>
<p><strong>Claim 1.</strong> Let <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> be a positive integer. Then there exists a sequence <img src='http://s0.wp.com/latex.php?latex=%28p_1%2C%5Cdots%2Cp_r%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p_1,&#92;dots,p_r)' title='(p_1,&#92;dots,p_r)' class='latex' /> of prime numbers such that <img src='http://s0.wp.com/latex.php?latex=p_1p_2%5Cdots+p_r%3Dn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1p_2&#92;dots p_r=n' title='p_1p_2&#92;dots p_r=n' class='latex' />. </p>
<p><strong>Claim 2.</strong> Let <img src='http://s0.wp.com/latex.php?latex=%28p_1%2C%5Cdots%2Cp_r%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p_1,&#92;dots,p_r)' title='(p_1,&#92;dots,p_r)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28q_1%2C%5Cdots%2Cq_s%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(q_1,&#92;dots,q_s)' title='(q_1,&#92;dots,q_s)' class='latex' /> be two sequences of primes and suppose that <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r%3Dq_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r=q_1&#92;dots q_s' title='p_1&#92;dots p_r=q_1&#92;dots q_s' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=r%3Ds&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=s' title='r=s' class='latex' /> and there is a permutation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cr%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,r&#92;}' title='&#92;{1,2,&#92;dots,r&#92;}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=q_i%3Dp_%7B%5Csigma%28i%29%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_i=p_{&#92;sigma(i)}' title='q_i=p_{&#92;sigma(i)}' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=i%5Cin%5C%7B1%2C2%2C%5Cdots%2Cr%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i&#92;in&#92;{1,2,&#92;dots,r&#92;}' title='i&#92;in&#92;{1,2,&#92;dots,r&#92;}' class='latex' />. </p>
<p>There was some choice about how I gave the formal statements of those two claims &#8212; if you think that you wouldn&#8217;t have stated Claim 2 in the way that I did, then that&#8217;s absolutely fine. Just to emphasize this point, here&#8217;s another way of stating Claim 2. (And there are many more.)</p>
<p><strong>Claim 2.</strong> Let <img src='http://s0.wp.com/latex.php?latex=p_1%5Cleq+p_2%5Cleq%5Cdots%5Cleq+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;leq p_2&#92;leq&#92;dots&#92;leq p_r' title='p_1&#92;leq p_2&#92;leq&#92;dots&#92;leq p_r' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q_1%5Cleq+q_2%5Cleq%5Cdots%5Cleq+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;leq q_2&#92;leq&#92;dots&#92;leq q_s' title='q_1&#92;leq q_2&#92;leq&#92;dots&#92;leq q_s' class='latex' /> be primes such that <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r%3Dq_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r=q_1&#92;dots q_s' title='p_1&#92;dots p_r=q_1&#92;dots q_s' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=r%3Ds&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=s' title='r=s' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=p_i%3Dq_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i=q_i' title='p_i=q_i' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=i%5Cin%5C%7B1%2C2%2C%5Cdots%2Cr%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i&#92;in&#92;{1,2,&#92;dots,r&#92;}' title='i&#92;in&#92;{1,2,&#92;dots,r&#92;}' class='latex' />.</p>
<p>Another way of stating the claim would be in terms of expressions of the form <img src='http://s0.wp.com/latex.php?latex=p_1%5E%7Ba_1%7D%5Cdots+p_r%5E%7Ba_r%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1^{a_1}&#92;dots p_r^{a_r}' title='p_1^{a_1}&#92;dots p_r^{a_r}' class='latex' /> where the <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> are strictly increasing. What matters is simply to do justice to the informal idea that the product is &#8220;unique apart from the order&#8221;. </p>
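<p>To see that these formulations really say the same thing, here is a small Python sketch. It is only an illustration, and the function names are hypothetical ones chosen for this example: each function turns a list of primes into one of the forms just described.</p>

```python
from collections import Counter


def same_up_to_order(ps, qs):
    """First formulation: qs is a rearrangement (permutation) of ps."""
    return Counter(ps) == Counter(qs)


def sorted_form(ps):
    """Second formulation: the primes listed in non-decreasing order."""
    return sorted(ps)


def exponent_form(ps):
    """Third formulation: strictly increasing primes paired with exponents."""
    return sorted(Counter(ps).items())


print(same_up_to_order([2, 3, 2], [3, 2, 2]))   # True
print(sorted_form([3, 2, 2]))                   # [2, 2, 3]
print(exponent_form([3, 2, 2]))                 # [(2, 2), (3, 1)]
```

<p>Two lists of primes are the same up to order exactly when their sorted forms agree, and exactly when their exponent forms agree, which is why it does not matter which statement of Claim 2 we choose.</p>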
<p>OK, let&#8217;s see what we can do with Claim 1.</p>
<p>2. <em>See what you can say about a minimal counterexample.</em></p>
<p>When you first stare at the statement &#8220;<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can be written as a product of primes&#8221; there seems to be nothing much to go on. Where do those primes come from? If you&#8217;re faced with a huge number <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> that isn&#8217;t obviously a multiple of some small prime like 2, 3 or 5, then the following strategy for proving that it has a prime factorization fails miserably.</p>
<p>(i) Define some numbers <img src='http://s0.wp.com/latex.php?latex=p_1%2C%5Cdots%2Cp_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1,&#92;dots,p_k' title='p_1,&#92;dots,p_k' class='latex' /> in terms of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>(ii) Prove that all the <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> are prime and that their product is equal to <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>That&#8217;s the kind of strategy I might use if I wanted to find integers <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%2Bb%3Dn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b=n' title='a+b=n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Ca-b%7C%5Cleq+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a-b|&#92;leq 1' title='|a-b|&#92;leq 1' class='latex' />. I&#8217;d define <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=%5Clfloor+n%2F2%5Crfloor&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lfloor n/2&#92;rfloor' title='&#92;lfloor n/2&#92;rfloor' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=%5Clceil+n%2F2%5Crceil&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lceil n/2&#92;rceil' title='&#92;lceil n/2&#92;rceil' class='latex' /> (these are the greatest integer that&#8217;s at most <img src='http://s0.wp.com/latex.php?latex=n%2F2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/2' title='n/2' class='latex' /> and the least integer that&#8217;s at least <img src='http://s0.wp.com/latex.php?latex=n%2F2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/2' title='n/2' class='latex' />, respectively), and then I&#8217;d argue that <img src='http://s0.wp.com/latex.php?latex=%7Ca-b%7C%5Cleq+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a-b|&#92;leq 1' title='|a-b|&#92;leq 1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a%2Bb%3Dn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b=n' title='a+b=n' class='latex' />, probably by splitting into the cases 
where <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is odd and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is even. But for finding some numbers that multiply together to give <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, it just doesn&#8217;t seem to work.</p>
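<p>The &#8220;define, then verify&#8221; strategy for this easy problem can be checked mechanically. The following Python sketch is an illustration only (the name <code>balanced_split</code> is hypothetical): it defines the two numbers by the floor and ceiling formulas and then checks both required properties for small values of n.</p>

```python
def balanced_split(n):
    """Return (a, b) with a = floor(n/2) and b = ceil(n/2)."""
    a = n // 2        # greatest integer that is at most n/2
    b = -(-n // 2)    # least integer that is at least n/2
    return a, b


# Step (ii) of the strategy: verify the two properties.
for n in range(1, 1000):
    a, b = balanced_split(n)
    assert a + b == n
    assert abs(a - b) in (0, 1)

print(balanced_split(7))    # (3, 4)
print(balanced_split(10))   # (5, 5)
```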
<p>So what other options have we got? Our problem is that we don&#8217;t seem to have any information to use. But we can give ourselves a huge boost if we use one of the most fundamental techniques known to mathematicians, namely induction. </p>
<p>With that hint, you might be tempted to try the following strategy, which is, I&#8217;m afraid, still pretty hopeless.</p>
<p>(i) Assume that <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' /> has a factorization into primes.</p>
<p>(ii) Attempt to prove that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has a factorization into primes.</p>
<p>Why is this hopeless? Because the multiplicative structure of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has nothing to do with the multiplicative structure of <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' />. </p>
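<p>One concrete way to see this: consecutive integers never share a prime factor, since any common divisor would also divide their difference, which is 1. The Python sketch below is a sanity check rather than a proof, and <code>prime_factors</code> is a hypothetical helper written for this example. It confirms the claim for small numbers and prints one pair of neighbours.</p>

```python
from math import gcd


def prime_factors(n):
    """Prime factors of n in increasing order, found by trial division."""
    factors, d = [], 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    return factors


# Any common divisor of n and n - 1 divides their difference, 1.
for n in range(2, 10000):
    assert gcd(n, n - 1) == 1

print(prime_factors(90), prime_factors(91))   # [2, 3, 3, 5] [7, 13]
```

<p>So knowing a factorization of 90 gives no primes at all that could be reused in a factorization of 91.</p>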
<p>So why did I suggest induction? Because there are other forms of induction that do much better. One is to exploit the observation that if you are trying to prove the <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> case, then you can, if you want, use not just the <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' /> case but <em>all</em> previous cases. In our problem, we are free to assume that <em>every positive integer less than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can be factorized into primes</em>. Aha, that looks as though it could be useful. </p>
<p>But why is it useful? How can we use prime factorizations of smaller numbers to obtain a prime factorization of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />? There are two closely related ways that people often use. The first is simply to describe the method that we tend to use when finding prime factorizations of small numbers. We look for the smallest prime that goes into <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, then we divide by that and repeat the process.</p>
<p>The fact that I just said &#8220;and repeat the process&#8221; is a sign that to make the argument formal I should use induction. (Another phrase that also alerts us to that is &#8220;and so on&#8221;.) And indeed I can, as long as I use the strong form just mentioned. Here&#8217;s how the argument might go.</p>
<p><strong>Proof 1.</strong> Let <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> be a positive integer. If <img src='http://s0.wp.com/latex.php?latex=n%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=1' title='n=1' class='latex' /> then we can write it as an &#8220;empty product&#8221; of primes. [This is not the important part of the proof.] Now let&#8217;s look for the smallest factor greater than 1 that we can find. There is no need to look for composite factors, since if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is composite and <img src='http://s0.wp.com/latex.php?latex=a%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a|n' title='a|n' class='latex' /> then the factors of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, which are smaller, also divide <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. Therefore, it suffices to check through all the primes less than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, in increasing order. If we find a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=p%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|n' title='p|n' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=n%2Fp&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n/p' title='n/p' class='latex' /> is a positive integer that&#8217;s smaller than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, so by the strong inductive hypothesis it can be written as a product of primes <img src='http://s0.wp.com/latex.php?latex=p_1p_2%5Cdots+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1p_2&#92;dots p_r' title='p_1p_2&#92;dots p_r' class='latex' />. 
But then <img src='http://s0.wp.com/latex.php?latex=n%3Dpp_1%5Cdots+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=pp_1&#92;dots p_r' title='n=pp_1&#92;dots p_r' class='latex' />. If we don&#8217;t find a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has no non-trivial factors, which implies that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is itself a prime. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
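<p>The argument of Proof 1 translates almost word for word into a recursive procedure, with the recursive call playing the role of the strong inductive hypothesis. Here is a minimal Python sketch, assuming a positive integer input; the name <code>factorize</code> is a hypothetical one chosen for this illustration.</p>

```python
def factorize(n):
    """Return a list of primes whose product is n (n a positive integer)."""
    if n == 1:
        return []   # the empty product
    # The smallest factor greater than 1 is automatically prime,
    # because any factor of it would be a smaller factor of n.
    for p in range(2, n + 1):
        if n % p == 0:
            # n // p is smaller than n, so the recursion terminates;
            # this is the step that uses strong induction.
            return [p] + factorize(n // p)


print(factorize(84))   # [2, 2, 3, 7]
print(factorize(97))   # [97]
```

<p>When the loop reaches p = n without finding a smaller factor, n has no non-trivial factors, and the procedure returns the one-term product consisting of n itself.</p>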
<p>In the course of writing that, we might notice that it didn&#8217;t really matter that <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> was prime, since all we needed was a number that had a prime factorization. So here&#8217;s a shorter version of the same argument.</p>
<p><strong>Proof 2.</strong> Let <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> be a positive integer. If <img src='http://s0.wp.com/latex.php?latex=n%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=1' title='n=1' class='latex' /> then it can be written as an empty product of primes. If <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq 2' title='n&#92;geq 2' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has no factors other than 1 and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is prime. Otherwise, <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can be written as <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> with both <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> strictly between 1 and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. 
But then, by strong induction, we can assume that both <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> can be factorized into primes, from which it follows (putting the two factorizations together) that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can as well. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
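<p>Proof 2 also corresponds to a recursive procedure, and this time no search for primes is needed: any non-trivial factor will do. The Python sketch below is an illustration under that reading (the function name is hypothetical). It deliberately splits off the largest proper factor, which is usually composite, to stress that the primality of the factor is never used.</p>

```python
def factorize_by_splitting(n):
    """Return a list of primes whose product is n, following Proof 2."""
    if n == 1:
        return []   # the empty product
    # Look for any a strictly between 1 and n that divides n.
    for a in range(n - 1, 1, -1):
        if n % a == 0:
            b = n // a
            # Both a and b lie strictly between 1 and n, so strong
            # induction applies to each; put the factorizations together.
            return factorize_by_splitting(a) + factorize_by_splitting(b)
    return [n]      # no non-trivial factor, so n is prime


print(sorted(factorize_by_splitting(360)))   # [2, 2, 2, 3, 3, 5]
```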
<p>The slogan that headed this section was &#8220;See what you can say about a minimal counterexample.&#8221; That is sort of what we were doing above. Here&#8217;s a way of making that more explicit. I&#8217;m using the well-ordering principle instead of induction.</p>
<p><strong>Proof 3.</strong> Suppose that there is a positive integer with no prime factorization. Then there must be a least such positive integer, <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, say. Since empty products and primes themselves count as products of primes, <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> must be composite, so let us write <img src='http://s0.wp.com/latex.php?latex=n%3Dab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=ab' title='n=ab' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> strictly between 1 and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. Then by the minimality of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> can be written as products of primes, from which it follows that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can be written as a product of primes, contradicting our initial assumption. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>Although thinking about minimal counterexamples is great for coming up with proofs, I don&#8217;t actually like the argument that results in this case, since it doesn&#8217;t use the hypothesis that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> can&#8217;t be written as a product of primes in a strong way. The evidence for that is in Proof 2, which is essentially the same argument but with no need for any contradiction.</p>
<p>Right, that&#8217;s done existence. Let&#8217;s turn to uniqueness.</p>
<p>3. <em>See what you can say about a minimal counterexample.</em></p>
<p>I did say that thinking about minimal counterexamples was good for coming up with arguments, so let&#8217;s see how we get on here. A <em>counterexample</em> to the uniqueness would be an equality of the form <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r%3Dq_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r=q_1&#92;dots q_s' title='p_1&#92;dots p_r=q_1&#92;dots q_s' class='latex' /> with the two sequences of primes not the same up to a permutation. What do we mean by a <em>minimal</em> counterexample? There&#8217;s a bit of choice here: we might mean that the number <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> that equals both those products is as small as possible, but we might decide that we&#8217;re minimizing <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' />. As it happens, both choices work for what I want to say next, which is that we can see immediately that no prime can appear on both sides of the equation. Why not? Because if it did, then we could strike out that prime from both sides (by dividing by it) and we would have two smaller products that were still equal and still distinct sequences of primes. So we&#8217;ve got immediately that no <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> is equal to any <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' />. </p>
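<p>The &#8220;strike out&#8221; step can be made concrete. The following Python sketch is only an illustration, with a hypothetical function name: given two lists of primes, it removes every prime that occurs on both sides, respecting multiplicity, and returns what is left of each list.</p>

```python
from collections import Counter


def cancel_common_primes(ps, qs):
    """Strike out primes occurring in both lists, with multiplicity."""
    p_count, q_count = Counter(ps), Counter(qs)
    # Counter subtraction keeps only positive counts, so each result is
    # what remains of one side after the shared primes are removed.
    left = sorted((p_count - q_count).elements())
    right = sorted((q_count - p_count).elements())
    return left, right


print(cancel_common_primes([2, 3, 3, 7], [3, 11]))   # ([2, 3, 7], [11])
print(cancel_common_primes([2, 3, 5], [5, 2, 3]))    # ([], [])
```

<p>The function cancels primes whether or not the two products are equal. If they are equal, each cancellation divides both sides by the same prime, so the reduced products are still equal, and the two reduced lists have no prime in common. A minimal counterexample must therefore already be in this reduced form.</p>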
<p>Now what? Having reduced to the case where no <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> is equal to any <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' /> we appear to be stuck: there is no longer any obvious way of making the example smaller.</p>
<p>It&#8217;s time for another standard proof-finding technique.</p>
<p>4. <em>Identify the simplest case you can&#8217;t prove.</em></p>
<p>This is not a completely precise piece of advice, because problems don&#8217;t always split neatly into cases, and even if they do, they may do so in more than one way. Nevertheless, it often happens that a simplified version of a problem is easy enough that you can solve it, while also being difficult enough that the solution gives you important clues about how to prove the original problem.</p>
<p>What could count as a &#8220;case&#8221; of the problem we are trying to solve? Well, the problem we&#8217;re now thinking about asks us to show that a product of <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> primes can never equal a product of <img src='http://s0.wp.com/latex.php?latex=s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s' title='s' class='latex' /> entirely different primes. So an obvious &#8220;simple case&#8221; is to take <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s' title='s' class='latex' /> to be small.</p>
<p>The advice is to find the simplest case that we can&#8217;t prove. So let&#8217;s start with the simplest case we can, <img src='http://s0.wp.com/latex.php?latex=r%3Ds%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=s=0' title='r=s=0' class='latex' />, and work up. Well, that&#8217;s a bit <em>too</em> simple, because we can&#8217;t have two distinct products of length 0. So how about <img src='http://s0.wp.com/latex.php?latex=r%3D0%2Cs%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=0,s=1' title='r=0,s=1' class='latex' />? That&#8217;s trivial: 1 is not equal to any prime. In fact, <img src='http://s0.wp.com/latex.php?latex=r%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=0' title='r=0' class='latex' /> is trivial whatever <img src='http://s0.wp.com/latex.php?latex=s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s' title='s' class='latex' /> is, so let&#8217;s try <img src='http://s0.wp.com/latex.php?latex=r%3D1%2Cs%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=1,s=1' title='r=1,s=1' class='latex' />. That&#8217;s now asking us to prove that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q' title='q' class='latex' /> are distinct primes, then <img src='http://s0.wp.com/latex.php?latex=p%5Cne+q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;ne q' title='p&#92;ne q' class='latex' />. I think we can manage that. How about <img src='http://s0.wp.com/latex.php?latex=r%3D1%2Cs%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=1,s=2' title='r=1,s=2' class='latex' />? 
That&#8217;s asking us to show that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is a prime and <img src='http://s0.wp.com/latex.php?latex=q_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1' title='q_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_2' title='q_2' class='latex' /> are primes not equal to <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=p%5Cne+q_1q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;ne q_1q_2' title='p&#92;ne q_1q_2' class='latex' />. Still easy, since <img src='http://s0.wp.com/latex.php?latex=q_1q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1q_2' title='q_1q_2' class='latex' /> isn&#8217;t a prime. In fact, that argument shows that even <img src='http://s0.wp.com/latex.php?latex=r%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=1' title='r=1' class='latex' /> is not interesting. So that brings us to <img src='http://s0.wp.com/latex.php?latex=r%3D2%2Cs%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r=2,s=2' title='r=2,s=2' class='latex' />. Is it obvious that <img src='http://s0.wp.com/latex.php?latex=p_1p_2%5Cne+q_1q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1p_2&#92;ne q_1q_2' title='p_1p_2&#92;ne q_1q_2' class='latex' /> if no <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> is equal to any <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' />?</p>
<p>If you&#8217;re inclined to answer yes at this point, then please read <a href="http://gowers.wordpress.com/2011/11/13/why-isnt-the-fundamental-theorem-of-arithmetic-obvious/">my previous post</a>. It&#8217;s not obvious at all, and needs a proof.</p>
<p>Unfortunately, having identified the first case we can&#8217;t solve, we&#8217;re still stuck. Why shouldn&#8217;t <img src='http://s0.wp.com/latex.php?latex=p_1p_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1p_2' title='p_1p_2' class='latex' /> equal <img src='http://s0.wp.com/latex.php?latex=q_1q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1q_2' title='q_1q_2' class='latex' />? How are we supposed to prove this?</p>
<p>It&#8217;s time for another standard proof-finding technique.</p>
<p>5. <em>Identify the simplest case you can&#8217;t prove.</em></p>
<p>That&#8217;s not a misprint. We&#8217;ve got a new hard problem, so why not try the same technique again? After all, there is quite a simple way of generating special cases, even for this problem, since we can look at small primes.</p>
<p>Now I don&#8217;t mean by this the silly idea of seeing whether we can prove something like <img src='http://s0.wp.com/latex.php?latex=2%5Ctimes+2%5Cne+3%5Ctimes+3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2&#92;times 2&#92;ne 3&#92;times 3' title='2&#92;times 2&#92;ne 3&#92;times 3' class='latex' />. Whatever a &#8220;special case&#8221; is, it will need to be a general statement of some kind. The easiest way to achieve that is to let at least one of the primes be a variable. Now a moment&#8217;s reflection tells us that if we&#8217;ve got a fixed number on one side then it&#8217;s still not a suitable special case. For example, we can show that <img src='http://s0.wp.com/latex.php?latex=2%5Ctimes+2%5Cne+3%5Ctimes+q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2&#92;times 2&#92;ne 3&#92;times q' title='2&#92;times 2&#92;ne 3&#92;times q' class='latex' /> by simply observing that if <img src='http://s0.wp.com/latex.php?latex=q%5Cne+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q&#92;ne 2' title='q&#92;ne 2' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=3q%3E4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3q&gt;4' title='3q&gt;4' class='latex' />. That doesn&#8217;t feel as though it is telling us anything.</p>
<p>So let&#8217;s try showing that <img src='http://s0.wp.com/latex.php?latex=2p%5Cne+3q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p&#92;ne 3q' title='2p&#92;ne 3q' class='latex' /> when <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q' title='q' class='latex' /> are distinct primes with <img src='http://s0.wp.com/latex.php?latex=p%5Cne+3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;ne 3' title='p&#92;ne 3' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q%5Cne+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q&#92;ne 2' title='q&#92;ne 2' class='latex' />. Is that an easy problem? Yes it is, since <img src='http://s0.wp.com/latex.php?latex=2p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p' title='2p' class='latex' /> is even and <img src='http://s0.wp.com/latex.php?latex=3q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3q' title='3q' class='latex' /> is odd (because <img src='http://s0.wp.com/latex.php?latex=q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q' title='q' class='latex' /> is a prime not equal to 2). </p>
<p>Does that count as an informative proof? It&#8217;s a bit early to say, but it has the promising feature that it involves divisibility. It&#8217;s time for another proof-discovery technique.</p>
<p>6. <em>If you manage to prove anything at all, then try to obtain the strongest result that follows from your proof method.</em></p>
<p>This general idea is the reason that looking at special cases is sometimes worth it. What was our proof? It was that <img src='http://s0.wp.com/latex.php?latex=2p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p' title='2p' class='latex' /> is even, and <img src='http://s0.wp.com/latex.php?latex=3q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3q' title='3q' class='latex' /> is odd. But what did we use in order to prove that <img src='http://s0.wp.com/latex.php?latex=3q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3q' title='3q' class='latex' /> was odd? We used the fact that 3 is odd, that <img src='http://s0.wp.com/latex.php?latex=q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q' title='q' class='latex' /> is odd, and that a product of two odd numbers is odd. Can we generalize our result? </p>
<p>Obviously yes, since if all we used about 3 was that it is odd, then we can replace 3 by any other odd prime. So a first generalization is that <img src='http://s0.wp.com/latex.php?latex=2p%5Cne+q_1q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p&#92;ne q_1q_2' title='2p&#92;ne q_1q_2' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=q_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1' title='q_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_2' title='q_2' class='latex' /> are odd primes. (The condition that they should not equal <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is not necessary &#8212; it just stops the problem being trivial.) But that&#8217;s not all we can do. The result that a product of two odd numbers is odd has an easy generalization to the result that a product of <em>any</em> number of odd numbers is odd. So if we use that, then we obtain the result that <img src='http://s0.wp.com/latex.php?latex=2p%5Cne+q_1q_2%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p&#92;ne q_1q_2&#92;dots q_s' title='2p&#92;ne q_1q_2&#92;dots q_s' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=q_1%2C%5Cdots%2Cq_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1,&#92;dots,q_s' title='q_1,&#92;dots,q_s' class='latex' /> are odd primes. </p>
<p>Have we finished? No we haven&#8217;t, since all we needed about the left-hand side was that it was even. So we can replace <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> by an arbitrary product of primes. That means that we have proved the general result we are trying to prove &#8212; that <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r%5Cne+q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r&#92;ne q_1&#92;dots q_s' title='p_1&#92;dots p_r&#92;ne q_1&#92;dots q_s' class='latex' /> &#8212; provided that <img src='http://s0.wp.com/latex.php?latex=p_1%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1=2' title='p_1=2' class='latex' />. </p>
<p>Now have we finished? Well, we&#8217;ve managed to prove the whole result when one of the primes is equal to 2. Can we use the same method for other primes? What would be required? </p>
<p>Well, the fact that drove the entire argument was this: a product of two odd numbers is an odd number, which implies that a product of any number of odd numbers is an odd number. What would be the corresponding fact we would have to use when <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' /> is a general prime?</p>
<p>The two numbers we want to be distinct are <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r' title='p_1&#92;dots p_r' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;dots q_s' title='q_1&#92;dots q_s' class='latex' />. What is the analogue for a general <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' /> of the concept of evenness? It is divisibility by <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />. So if we want to copy the argument that worked when <img src='http://s0.wp.com/latex.php?latex=p_1%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1=2' title='p_1=2' class='latex' />, we should start by observing that <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r' title='p_1&#92;dots p_r' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' /> and we should aim to prove that <img src='http://s0.wp.com/latex.php?latex=q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;dots q_s' title='q_1&#92;dots q_s' class='latex' /> is <em>not</em> divisible by <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />. What information do we have to go on? 
We know that no <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> is equal to any <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' />, but if we bear in mind what we did when <img src='http://s0.wp.com/latex.php?latex=p_1%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1=2' title='p_1=2' class='latex' /> we will expect that all that really matters is that no <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' /> is equal to <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />. </p>
<p>So now we&#8217;ve reduced our problem to the following. If none of <img src='http://s0.wp.com/latex.php?latex=q_1%2C%5Cdots%2Cq_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1,&#92;dots,q_s' title='q_1,&#92;dots,q_s' class='latex' /> is equal to <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;dots q_s' title='q_1&#92;dots q_s' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />. And if we want to be a bit bolder, we could spot that when <img src='http://s0.wp.com/latex.php?latex=p_1%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1=2' title='p_1=2' class='latex' /> we didn&#8217;t even use the fact that the <img src='http://s0.wp.com/latex.php?latex=q_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_i' title='q_i' class='latex' /> were prime &#8212; we just used the fact that they were odd. So we might expect that the right thing to try to prove is that if all the <img src='http://s0.wp.com/latex.php?latex=q_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_i' title='q_i' class='latex' /> are positive integers and none of them is a multiple of <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;dots q_s' title='q_1&#92;dots q_s' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />.</p>
<p>An advantage of this final formulation is that it follows straightforwardly from the case <img src='http://s0.wp.com/latex.php?latex=s%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s=2' title='s=2' class='latex' />. That is, we will be done if we can prove the following result: whenever <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are non-multiples of <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' />, then so is <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' />. Turning to the contrapositive, and forgetting the suffix 1, which is no longer playing a useful role, we find that the fundamental theorem of arithmetic is reduced to the following statement.</p>
<p><strong>Claim.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be positive integers such that <img src='http://s0.wp.com/latex.php?latex=p%7Cab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|ab' title='p|ab' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />.</em></p>
<p>However, this statement itself does not seem all that easy to prove, so it&#8217;s time to go back to our favourite method for getting an idea. Here it is again.</p>
<p>7. <em>Identify the simplest case you can&#8217;t prove.</em></p>
<p>A natural notion of &#8220;case&#8221; is obtained if we think about what prime we are talking about. We&#8217;ve already done the case <img src='http://s0.wp.com/latex.php?latex=p%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p=2' title='p=2' class='latex' />, so how about the case <img src='http://s0.wp.com/latex.php?latex=p%3D3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p=3' title='p=3' class='latex' />? If <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are non-multiples of 3, does it follow that <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> is a non-multiple of 3? Yes it does, and here&#8217;s a quick reason for that. Let&#8217;s consider what <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are mod 3. Since they are non-multiples of 3, each of them is congruent either to 1 or to 2. But <img src='http://s0.wp.com/latex.php?latex=1%5Ctimes+1%3D2%5Ctimes+2%5Cequiv+1%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1&#92;times 1=2&#92;times 2&#92;equiv 1,' title='1&#92;times 1=2&#92;times 2&#92;equiv 1,' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=1%5Ctimes+2%3D2%5Ctimes+1%5Cequiv+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1&#92;times 2=2&#92;times 1&#92;equiv 2' title='1&#92;times 2=2&#92;times 1&#92;equiv 2' class='latex' />, so we never get a multiple of 3. </p>
<p>Note that that was a generalization of the proof that a product of odd numbers is odd, which is encouraging. Indeed, if you stop to think about it, you&#8217;ll see that for any <em>given</em> prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, there is in principle a proof of what we want: you just calculate the multiplication table mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and check that there are no non-trivial zero divisors.</p>
<p>A bit less encouraging is that as <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> gets bigger, the proof you write down gets longer. What&#8217;s more, it doesn&#8217;t give us any insight into why the result is true: we just work out the multiplication table, keeping our fingers crossed that we won&#8217;t find zero divisors. What is lacking is any <em>pattern</em> to the results, or a <em>general reason</em> for the divisors not to exist.</p>
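<p>If you would like to see the brute-force check in action, here is a minimal Python sketch. It is not part of the argument, and the function name <code>zero_divisor_pairs</code> is just an illustrative choice. It lists every pair of non-zero residues whose product is 0 mod a given modulus, so for a prime modulus we hope for an empty list.</p>

```python
def zero_divisor_pairs(n):
    """Return all pairs (a, b) of non-zero residues mod n with a*b congruent to 0."""
    residues = range(1, n)
    return [(a, b) for a in residues for b in residues if (a * b) % n == 0]


# For each of these small primes the multiplication table has no zero divisors.
for p in (2, 3, 5, 7):
    print(p, zero_divisor_pairs(p))

# For contrast, a composite modulus such as 6 does have them, e.g. 2 * 3 = 6.
print(6, zero_divisor_pairs(6))
```

<p>Of course, a check like this only ever deals with one modulus at a time, which is exactly the limitation just described: it confirms the result for the primes you try, but gives no general reason.</p>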
<p>8. <em>Try to draw analogies with other contexts.</em></p>
<p>Maybe by this stage you know what to do: if you know the proof that there are no non-trivial zero divisors mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, then you are finished. But suppose you didn&#8217;t know. Then you might once again have that stuck feeling. Another good way of getting out of that situation is to look at similar problems where you <em>do</em> know what to do. (This is an incredibly important method used in mathematical research &#8212; I&#8217;m tempted to say that it is the most important method we have.)</p>
<p>What, then, does the phrase &#8220;no zero divisors&#8221; suggest to you? Perhaps the familiar result that if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are two non-zero real numbers, then <img src='http://s0.wp.com/latex.php?latex=xy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='xy' title='xy' class='latex' /> is also non-zero. How do we prove that? We say that if <img src='http://s0.wp.com/latex.php?latex=xy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='xy' title='xy' class='latex' /> <em>is</em> zero, and if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is not zero, then we can divide both sides by <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and deduce that <img src='http://s0.wp.com/latex.php?latex=y%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=0' title='y=0' class='latex' />.</p>
<p>OK, back to what we&#8217;re actually trying to prove. We&#8217;ve got two numbers <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and would like to deduce that either <img src='http://s0.wp.com/latex.php?latex=a%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;equiv 0' title='a&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv 0' title='b&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. The real-number result suggests that we should try to divide both sides of <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> (unless <img src='http://s0.wp.com/latex.php?latex=a%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;equiv 0' title='a&#92;equiv 0' class='latex' />). Can we do that? Well, we don&#8217;t exactly have division, so let&#8217;s think a bit further.  </p>
<p>What <em>is</em> division of real numbers? The simplest way of defining it is perhaps this. First, we establish that every non-zero real number has a multiplicative inverse (or reciprocal), and then we define <img src='http://s0.wp.com/latex.php?latex=x%2Fy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x/y' title='x/y' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> multiplied by the inverse of <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. </p>
<p>This way of looking at things gives us something we can take back to the mod-<img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> context. We start with the assumption that <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. We&#8217;d now like to multiply both sides of this equation by a multiplicative inverse of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, assuming such a thing exists. If we can find <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, then from the fact that <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> we will be able to deduce that <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+1b%5Cequiv+cab%5Cequiv+c0%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv 1b&#92;equiv cab&#92;equiv c0&#92;equiv 0' title='b&#92;equiv 1b&#92;equiv cab&#92;equiv c0&#92;equiv 0' class='latex' />.</p>
<p>So now the whole thing has boiled down to the statement that if <img src='http://s0.wp.com/latex.php?latex=a%5Cnot%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;not&#92;equiv 0' title='a&#92;not&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> then there exists some <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</p>
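<p>To make the role of the inverse concrete, here is a small Python sketch with specific numbers. The helper <code>find_inverse</code> and the values 7, 3 and 14 are my own illustrative choices. It finds the inverse by naive search, which says nothing about <em>why</em> one should exist, and then checks that multiplying by it forces the other factor to be 0 mod the prime.</p>

```python
def find_inverse(a, p):
    """Search 1..p-1 for c with c*a congruent to 1 mod p; return None if absent."""
    for c in range(1, p):
        if (c * a) % p == 1:
            return c
    return None


p, a, b = 7, 3, 14          # a*b = 42 is a multiple of 7, and a is not
c = find_inverse(a, p)      # c = 5, since 5 * 3 = 15 = 2 * 7 + 1

# b is congruent to (c*a)*b = c*(a*b), and a*b is 0 mod p, so b is 0 mod p.
assert (c * a) % p == 1
assert (c * a * b) % p == b % p == 0
```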
<p>Just before we continue, I want to stress that we have actually done something here. Sometimes a reformulation of a statement produces a new statement that is so obviously equivalent to the original statement that it doesn&#8217;t get you any closer to proving it. An example would be what we did a little earlier when we reformulated this</p>
<li><em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime number and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be positive integers. Then if <img src='http://s0.wp.com/latex.php?latex=p%7Cab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|ab' title='p|ab' class='latex' /> it follows that <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />.</em></li>
<p>as this</p>
<li><em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime number and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be positive integers such that <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Then either <img src='http://s0.wp.com/latex.php?latex=a%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;equiv 0' title='a&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv 0' title='b&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</em></li>
<p>That achieves almost nothing, because <img src='http://s0.wp.com/latex.php?latex=p%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|n' title='p|n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;equiv 0' title='n&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> are basically the same statement. However, it does give us a nudge in the direction of the next reformulation, which is genuinely different.</p>
<li><em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime number and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> be a positive integer such that <img src='http://s0.wp.com/latex.php?latex=a%5Cnot%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;not&#92;equiv 0' title='a&#92;not&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Then there exists <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</em></li>
<p>This last reformulation is different because it gives us <em>a strategy for showing that <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv 0' title='b&#92;equiv 0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /></em>. Once we&#8217;ve come up with this useful number <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />, we&#8217;ll be done.</p>
<p>But how do we come up with <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />?</p>
<p>9. <em>If there&#8217;s another way of looking at something, then even if it isn&#8217;t a substantial reformulation, it may make the problem easier to think about.</em></p>
<p>In this case, we could try translating back from mod-<img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> language to ordinary integer language. Here&#8217;s the reformulation that results.</p>
<li><em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime number and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> be a positive integer that is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Show that there is an integer <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca' title='ca' class='latex' /> leaves a remainder of 1 when you divide by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />.</em></li>
<p>Hmm &#8230; that last bit doesn&#8217;t feel like enough of a reformulation. What does it mean to say that <img src='http://s0.wp.com/latex.php?latex=ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca' title='ca' class='latex' /> leaves a remainder of 1 when you divide by <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />? It means that <img src='http://s0.wp.com/latex.php?latex=ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca' title='ca' class='latex' /> is 1 more than a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. So let&#8217;s tidy up the latest reformulation.</p>
<li><em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime number and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> be a positive integer that is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Show that there is an integer <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> and another integer <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%3Dmp%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca=mp+1' title='ca=mp+1' class='latex' />.</em></li>
<p>10. <em>If anything you see reminds you of anything you know, then pounce.</em></p>
<p>We&#8217;re given two integers <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, about which we know that <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is prime and <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. We are trying to find <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%3Dmp%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca=mp+1' title='ca=mp+1' class='latex' />. Does this look like anything we&#8217;ve seen before? Yes it does. It&#8217;s very similar to B&eacute;zout&#8217;s theorem. Indeed, if we replace <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=-m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-m' title='-m' class='latex' />, then our task is to solve the equation <img src='http://s0.wp.com/latex.php?latex=ca%2Bmp%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca+mp=1' title='ca+mp=1' class='latex' />, and B&eacute;zout&#8217;s theorem tells us we can do this as long as <img src='http://s0.wp.com/latex.php?latex=%28a%2Cp%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,p)=1' title='(a,p)=1' class='latex' />. 
But <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is prime, so <img src='http://s0.wp.com/latex.php?latex=%28a%2Cp%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,p)' title='(a,p)' class='latex' /> has to be 1 or <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Since it&#8217;s not <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> (by assumption) it must be 1, and we are done. </p>
<p>That&#8217;s a proof discovery process assuming you already know B&eacute;zout&#8217;s theorem. But note that the process did not start with the hint, &#8220;By the way, B&eacute;zout&#8217;s theorem is useful.&#8221; Rather, the need for B&eacute;zout&#8217;s theorem arose naturally. So even if you don&#8217;t know B&eacute;zout&#8217;s theorem, at least you can still arrive at the <em>statement</em> of the theorem and recognise that once you&#8217;ve proved it you can deduce the fundamental theorem of arithmetic.</p>
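<p>If you want to see numbers that satisfy the equation, the sketch below computes them with the extended Euclidean algorithm. I should stress that this is a standard technique that I am bringing in purely for illustration, not a step of the discovery process above, and the function name <code>bezout</code> is my own. Given two integers it returns their highest common factor together with coefficients that express it as a combination of the two.</p>

```python
def bezout(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g."""
    # Invariant: old_r == old_x*a + old_y*b  and  r == x*a + y*b.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


# Example with a = 10 and p = 7: we get c*10 + m*7 == 1.
g, c, m = bezout(10, 7)
assert g == 1 and c * 10 + m * 7 == 1
# Reducing c mod 7 gives a multiplicative inverse of 10 mod 7.
assert ((c % 7) * 10) % 7 == 1
```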
<p>What about B&eacute;zout&#8217;s theorem though? How might you come up with a proof of <em>that</em>? There are many possible ways &#8212; I&#8217;ll describe one.</p>
<p>Let&#8217;s revisit one of the recent problem-solving tips.</p>
<p>11. <em>If there&#8217;s another way of looking at something, then even if it isn&#8217;t a substantial reformulation, it may make the problem easier to think about.</em></p>
<p>OK, we&#8217;ve got a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and a number <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> that is not congruent to 0 mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. We are trying to find a multiplicative inverse for <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. That is, we are trying to solve the equation <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' />. </p>
<p>But let&#8217;s remember why we were trying to find <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />. The reason was that we wanted a <em>cancellation law</em>. That is, we wanted to be able to say that if <img src='http://s0.wp.com/latex.php?latex=ax%5Cequiv+ay&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax&#92;equiv ay' title='ax&#92;equiv ay' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=x%5Cequiv+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;equiv y' title='x&#92;equiv y' class='latex' />, or rather we wanted to be able to do that in the special case where <img src='http://s0.wp.com/latex.php?latex=x%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=b' title='x=b' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=0' title='y=0' class='latex' />. (It was a bit silly to call <img src='http://s0.wp.com/latex.php?latex=x%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=b' title='x=b' class='latex' /> a &#8220;special case&#8221;, since <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> is arbitrary, but let&#8217;s not worry about that. Certainly, restricting to <img src='http://s0.wp.com/latex.php?latex=y%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=0' title='y=0' class='latex' /> is a special case.)</p>
<p>Now let&#8217;s see whether <img src='http://s0.wp.com/latex.php?latex=ax%3Day%5Cimplies+x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax=ay&#92;implies x=y' title='ax=ay&#92;implies x=y' class='latex' /> makes us think of anything. It certainly should if you&#8217;ve read and digested the post in this series about properties of functions. It looks very like the definition of an injection. In fact, it <em>is</em> the definition of an injection for the function that takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=ax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax' title='ax' class='latex' /> (where I&#8217;m thinking of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=ax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax' title='ax' class='latex' /> as elements of the group of integers mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />). </p>
<p>Let&#8217;s denote this group by <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />. We&#8217;ve just defined a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BZ%7D_p%5Cto%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{Z}_p&#92;to&#92;mathbb{Z}_p' title='f:&#92;mathbb{Z}_p&#92;to&#92;mathbb{Z}_p' class='latex' /> by setting <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=ax' title='f(x)=ax' class='latex' />. In this terminology, our aim is to prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection.</p>
<p>Now injections from finite sets to themselves have a remarkable property, though you may not find it all that remarkable: they are also surjections. To see that this statement is saying at least <em>something</em>, note that it is quite clearly false for infinite sets. For instance, the function that takes <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=n%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n+1' title='n+1' class='latex' /> is an injection from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> that never takes the value 1. Informally, the reason it is true for finite sets is that if <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is finite and <img src='http://s0.wp.com/latex.php?latex=f%3AX%5Cto+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:X&#92;to X' title='f:X&#92;to X' class='latex' /> is a function, then the only way that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> can send distinct elements to distinct elements is if it uses up the whole of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> as its image. 
Equivalently, there is no injection from <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn-1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n-1&#92;}' title='&#92;{1,2,&#92;dots,n-1&#92;}' class='latex' /> for any positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. To prove this formally is a good exercise in the use of induction, though it may well have been a lemma in the course. And you should be able to deduce easily from this result that if <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is finite and <img src='http://s0.wp.com/latex.php?latex=f%3AX%5Cto+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:X&#92;to X' title='f:X&#92;to X' class='latex' /> then the three properties &#8220;is an injection&#8221;, &#8220;is a surjection&#8221; and &#8220;is a bijection&#8221; are equivalent.</p>
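<p>If you would like to see the finite statement in action before proving it, here is a minimal Python sketch that simply tries every function from a small set to itself. The function names are made up for this illustration, and the set is taken to be {0, 1, ..., n-1} rather than {1, 2, ..., n} purely for convenience. An exhaustive check for small n is of course no substitute for the induction.</p>

```python
from itertools import product


def is_injection(f):
    # f is a tuple in which f[i] is the image of i; injective means no repeats.
    return len(set(f)) == len(f)


def is_surjection(f):
    # Surjective means every element of {0, ..., n-1} occurs as a value.
    return set(f) == set(range(len(f)))


def injections_are_surjections(n):
    """Check every map from {0, ..., n-1} to itself (there are n**n of them)."""
    return all(
        is_injection(f) == is_surjection(f)
        for f in product(range(n), repeat=n)
    )


for n in range(1, 6):
    print(n, injections_are_surjections(n))
```

<p>The search space grows like n to the power n, so this is only sensible for very small n.</p>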
<p>When an equivalence is true but not completely trivial to prove, it is almost always useful. Here, it doesn&#8217;t seem to get us anywhere. After all, we started by wanting to show that we can find <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' /> and have ended up by saying that it will be enough to prove that the function that takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=ax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax' title='ax' class='latex' /> (which equals <img src='http://s0.wp.com/latex.php?latex=xa&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='xa' title='xa' class='latex' />) is a surjection. Well, of course that will be enough! We want to prove that it takes the value 1 &#8212; it surely doesn&#8217;t make our task easier to show that it takes <em>all</em> possible values. </p>
<p>That, however, is not a valid argument. Often it <em>is</em> easier to prove a stronger result, because you have less room for manoeuvre, and the less choice you have about what to do, the easier it is to work out what to do. It&#8217;s a bit like having only one way of getting out of check: whatever the disadvantages may be, at least there is no problem working out what your best move is. And here we know that it isn&#8217;t a real disadvantage because we&#8217;ve managed to establish, for our particular function, that it takes the value 1 somewhere if and only if it is a surjection.</p>
<p>That still doesn&#8217;t really give us much idea of why it could be advantageous to go for the stronger-looking statement. One reason is that it leads us naturally to the following line of thought. We are trying to prove something about the image of the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> (the function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=ax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax' title='ax' class='latex' />). The statement we want to prove is that the image of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />, but instead of trying to do that all in one go, let&#8217;s use another problem-solving strategy.</p>
<p>12. <em>If you don&#8217;t immediately see why an object has a certain property, then just see what you can say about that object: it may help.</em></p>
<p>What can we say about the image of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />? We can write it down if we want: it&#8217;s the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B0%2Ca%2C2a%2C3a%2C%5Cdots%2C%28p-1%29a%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{0,a,2a,3a,&#92;dots,(p-1)a&#92;}' title='&#92;{0,a,2a,3a,&#92;dots,(p-1)a&#92;}' class='latex' />, where all those numbers are to be interpreted mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. In other words, it&#8217;s the set of all multiples of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />. </p>
<p>We don&#8217;t at this point know that all those numbers are distinct mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, or we&#8217;d have our injection/surjection, but what can we say about the set? Well, it&#8217;s obviously closed under addition mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> &#8212; if you add two multiples of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> together you&#8217;ll get another one. But hang on, <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> forms a group under addition mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and we&#8217;ve got a subset that&#8217;s closed under addition. Could it be a subgroup? Indeed it could: the inverse of <img src='http://s0.wp.com/latex.php?latex=ra&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ra' title='ra' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%28p-r%29a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p-r)a' title='(p-r)a' class='latex' /> so it&#8217;s closed under taking inverses as well. 
(You may have had it pointed out to you that for subsets of <em>finite</em> groups it&#8217;s enough to show that they are closed under the group operation: you can get the inverse of <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> by finding <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%5Er%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^r=e' title='g^r=e' class='latex' /> and taking <img src='http://s0.wp.com/latex.php?latex=g%5E%7Br-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{r-1}' title='g^{r-1}' class='latex' />.)</p>
<p>But the group <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> is cyclic of prime order, and we know all about subgroups of those: they are trivial. Why? Because Lagrange&#8217;s theorem tells us they must have size 1 or <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, and that forces them to be <img src='http://s0.wp.com/latex.php?latex=%5C%7B0%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{0&#92;}' title='&#92;{0&#92;}' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=a%5Cnot%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;not&#92;equiv 0' title='a&#92;not&#92;equiv 0' class='latex' />, it follows that the set of multiples of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, or equivalently the subgroup generated by <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, is the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />. </p>
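<p>Here is a small Python sketch of that conclusion, under the assumption that we represent elements of the group by the residues 0, 1, ..., p-1. The helper name is invented for this illustration. It lists the image of the map that multiplies by a, first for a prime modulus and then, for contrast, for a composite one, where the image can be a proper nontrivial subgroup.</p>

```python
def image_of_multiplication(a, n):
    """Set of values a*x mod n as x runs over the residues 0, 1, ..., n-1."""
    return {(a * x) % n for x in range(n)}


# Prime modulus: for every a not congruent to 0, the image is everything.
p = 7
for a in range(1, p):
    assert image_of_multiplication(a, p) == set(range(p))

# Composite modulus for contrast: the multiples of 4 mod 6 form a subgroup
# of size 3, which Lagrange permits because 3 divides 6.
print(sorted(image_of_multiplication(4, 6)))  # [0, 2, 4]
```

<p>The composite example shows where primality is used: when the modulus is 6 there is room for a subgroup of intermediate size, and then multiplication by 4 is neither a surjection nor an injection.</p>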
<p>As I hope you&#8217;ve seen, we&#8217;re done. We&#8217;ve just shown that the image of the map is the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />, which tells us that it&#8217;s a surjection, which tells us that it&#8217;s an injection, which tells us that we have the cancellation law, which tells us that if <img src='http://s0.wp.com/latex.php?latex=ab%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab&#92;equiv 0' title='ab&#92;equiv 0' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a%5Cnot%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;not&#92;equiv 0' title='a&#92;not&#92;equiv 0' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv 0' title='b&#92;equiv 0' class='latex' />. </p>
<p><strong>Conclusion.</strong></p>
<p>It&#8217;s possible to give other accounts of how one might arrive at a proof of the fundamental theorem of arithmetic. Indeed, <a href="http://www.dpmms.cam.ac.uk/~wtg10/FTA.html">I wrote one myself</a> several years ago that is similar to this one up to the point of formulating the lemma that if <img src='http://s0.wp.com/latex.php?latex=p%7Cab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|ab' title='p|ab' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> divides one of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, but that diverges from it thereafter, ending up with the more usual proof via Euclid&#8217;s algorithm. (I deliberately didn&#8217;t read that one before writing this, and I find it quite interesting to see how similar much of what I&#8217;ve written now is to what I wrote then.)</p>
<p>I worry sometimes that accounts like this of how a proof might be discovered can be off-puttingly long. So it&#8217;s important to stress that the actual <em>proofs</em> are much much shorter. Here&#8217;s how the proof that the above thoughts lead to ends up. I&#8217;ll just do the uniqueness part, and I&#8217;ll write the whole thing in logical order, which is more or less the reverse of the order in which one discovers the steps.</p>
<p><strong>Proof that prime factorizations are unique.</strong></p>
<p><strong>Lemma 1.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> be an integer not congruent to <img src='http://s0.wp.com/latex.php?latex=0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0' title='0' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. Then the equation <img src='http://s0.wp.com/latex.php?latex=ax%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax&#92;equiv 1' title='ax&#92;equiv 1' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> has a solution.</em></p>
<p><strong>Proof.</strong> Define a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BZ%7D_p%5Cto%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{Z}_p&#92;to&#92;mathbb{Z}_p' title='f:&#92;mathbb{Z}_p&#92;to&#92;mathbb{Z}_p' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dax&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=ax' title='f(x)=ax' class='latex' />. Then the image of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a subgroup of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' /> (since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a homomorphism). Since it contains <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, it is not the subgroup <img src='http://s0.wp.com/latex.php?latex=%5C%7B0%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{0&#92;}' title='&#92;{0&#92;}' class='latex' />, so by Lagrange&#8217;s theorem it must be all of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D_p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}_p' title='&#92;mathbb{Z}_p' class='latex' />. In particular, it contains 1. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p><strong>Corollary 2.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime and let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be integers such that <img src='http://s0.wp.com/latex.php?latex=p%7Cab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|ab' title='p|ab' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />.</em></p>
<p><strong>Proof.</strong> If <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is not a multiple of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> then by Lemma 1 we can find <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=ca%5Cequiv+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ca&#92;equiv 1' title='ca&#92;equiv 1' class='latex' /> mod <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=b%5Cequiv+%28ca%29b%5Cequiv+c%28ab%29%5Cequiv+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;equiv (ca)b&#92;equiv c(ab)&#92;equiv 0' title='b&#92;equiv (ca)b&#92;equiv c(ab)&#92;equiv 0' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
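<p>The chain of congruences in that proof can be replayed numerically. The following Python sketch uses hypothetical helper names and finds the inverse by brute-force search, which is an assumption made for simplicity: Lemma 1 only promises that an inverse exists and says nothing about how to find it.</p>

```python
def inverse_mod(a, p):
    """Search for c with c*a congruent to 1 mod p, as promised by Lemma 1."""
    for c in range(1, p):
        if (c * a) % p == 1:
            return c
    raise ValueError("no inverse found: is p prime and a not a multiple of p?")


def replay_corollary_2(p, a, b):
    """Given p | ab with p not dividing a, replay b = (ca)b = c(ab) = 0 mod p."""
    assert (a * b) % p == 0 and a % p != 0
    c = inverse_mod(a, p)
    # Each step of the proof, checked mod p.
    assert b % p == ((c * a) * b) % p == (c * (a * b)) % p == 0
    return c


# 7 divides 3 * 14 but not 3, so it must divide 14; the inverse of 3 mod 7 is 5.
print(replay_corollary_2(7, 3, 14))  # 5
```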
<p><strong>Corollary 3.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> be a prime and let <img src='http://s0.wp.com/latex.php?latex=a_1%2C%5Cdots%2Ca_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_1,&#92;dots,a_k' title='a_1,&#92;dots,a_k' class='latex' /> be integers such that <img src='http://s0.wp.com/latex.php?latex=p%7Ca_1%5Cdots+a_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a_1&#92;dots a_k' title='p|a_1&#92;dots a_k' class='latex' />. Then there exists <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=p%7Ca_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a_i' title='p|a_i' class='latex' />.</em></p>
<p><strong>Proof.</strong> Induction on <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. We&#8217;ve proved it if <img src='http://s0.wp.com/latex.php?latex=k%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=2' title='k=2' class='latex' />. For larger <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> we know by the <img src='http://s0.wp.com/latex.php?latex=k%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=2' title='k=2' class='latex' /> case that either <img src='http://s0.wp.com/latex.php?latex=p%7Ca_1%5Cdots+a_%7Bk-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a_1&#92;dots a_{k-1}' title='p|a_1&#92;dots a_{k-1}' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Ca_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a_k' title='p|a_k' class='latex' />. In the second case we are done and in the first case we are done by induction. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p><strong>Proof of uniqueness.</strong> Suppose that <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r%3Dq_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r=q_1&#92;dots q_s' title='p_1&#92;dots p_r=q_1&#92;dots q_s' class='latex' />, that the two products are not the same even after reordering, and that <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s' title='s' class='latex' /> are minimal such that this can be done. Then no <img src='http://s0.wp.com/latex.php?latex=p_i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_i' title='p_i' class='latex' /> can equal any <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' />, or we could divide through and get an example with <img src='http://s0.wp.com/latex.php?latex=r-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r-1' title='r-1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=s-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s-1' title='s-1' class='latex' />. </p>
<p>Since the <img src='http://s0.wp.com/latex.php?latex=q_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_j' title='q_j' class='latex' /> are prime, it follows that <img src='http://s0.wp.com/latex.php?latex=p_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1' title='p_1' class='latex' /> does not divide any of them. But then, by Corollary 3, it does not divide their product <img src='http://s0.wp.com/latex.php?latex=q_1%5Cdots+q_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q_1&#92;dots q_s' title='q_1&#92;dots q_s' class='latex' />. But it does divide <img src='http://s0.wp.com/latex.php?latex=p_1%5Cdots+p_r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1&#92;dots p_r' title='p_1&#92;dots p_r' class='latex' />, contradicting the assumption that the two products are equal. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/18/proving-the-fundamental-theorem-of-arithmetic/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Why isn&#8217;t the fundamental theorem of arithmetic obvious?</title>
		<link>http://gowers.wordpress.com/2011/11/13/why-isnt-the-fundamental-theorem-of-arithmetic-obvious/</link>
		<comments>http://gowers.wordpress.com/2011/11/13/why-isnt-the-fundamental-theorem-of-arithmetic-obvious/#comments</comments>
		<pubDate>Sun, 13 Nov 2011 11:33:12 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Numbers and Sets]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3741</guid>
		<description><![CDATA[The fundamental theorem of arithmetic states that every positive integer can be factorized in exactly one way as a product of prime numbers. This statement has to be appropriately interpreted: we count the factorizations 3 &#215; 5 &#215; 13 and 13 &#215; 3 &#215; 5 as the same, for instance. Note that it is essential not to count 1 as a prime, or else we could [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3741&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The fundamental theorem of arithmetic states that every positive integer can be factorized in exactly one way as a product of prime numbers. This statement has to be appropriately interpreted: we count the factorizations <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+5%5Ctimes+13&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 5&#92;times 13' title='3&#92;times 5&#92;times 13' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=13%5Ctimes+3%5Ctimes+5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='13&#92;times 3&#92;times 5' title='13&#92;times 3&#92;times 5' class='latex' /> as the same, for instance. Note that it is essential not to count 1 as a prime, or else we could stick a product of 1s on to the end of any factorization to get a different one: <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+5%5Ctimes+13%3D3%5Ctimes+5%5Ctimes+13%5Ctimes+1%5Ctimes+1%5Ctimes+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 5&#92;times 13=3&#92;times 5&#92;times 13&#92;times 1&#92;times 1&#92;times 1' title='3&#92;times 5&#92;times 13=3&#92;times 5&#92;times 13&#92;times 1&#92;times 1&#92;times 1' class='latex' />. But doesn&#8217;t that mean that 1 itself cannot be written as a product of primes? No &#8212; we define the &#8220;empty product&#8221; (what you get when you take a bunch of &#8230; no numbers at all and multiply them together) to be 1. That is a sensible convention because we would like multiplying a product of numbers by the empty product not to make any change to the result.<br />
<span id="more-3741"></span></p>
<p>That&#8217;s enough about what the fundamental theorem of arithmetic says. In this post I want to discuss the question of why it is a theorem at all. Isn&#8217;t it more like an <em>observation</em>? After all, given any number, we can simply work out its prime factorization.</p>
<p><strong>Answer 1.</strong> <em>If you think it&#8217;s obvious, then you&#8217;re probably assuming what you need to prove.</em></p>
<p>If you say, &#8220;we can simply work out its prime factorization,&#8221; you are already assuming that that factorization is unique. Otherwise, you would have had to say, &#8220;we can simply work out a prime factorization for it&#8221;. Of course, if you say it that way, it suddenly doesn&#8217;t seem quite as obvious that there&#8217;s only one. If you&#8217;re trying to argue that it&#8217;s obvious and you ever utter the phrase, &#8220;the prime factorization,&#8221; then you are begging the question, since implicit in those words is the assertion that there is only one prime factorization.</p>
<p><strong>Answer 2.</strong> <em>Just because you&#8217;ve got a completely deterministic method for working out a prime factorization, that doesn&#8217;t mean what you work out is the only prime factorization.</em></p>
<p>The following method is probably how you factorize a number: you divide it by 2 as many times as you can (which may be no times at all), then by 3, then by 5, and so on, keeping track of what you&#8217;ve done. For example, if your starting number is 575, then you can&#8217;t divide it by 2 or 3, but you can divide it by 5 to get 115, and then by 5 again to get 23, and then &#8230; well, you&#8217;ll probably know that 23 is prime, but you could also argue that since <img src='http://s0.wp.com/latex.php?latex=5%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='5^2' title='5^2' class='latex' /> is greater than 23 and you&#8217;ve checked 2 and 3, then it <em>must</em> be prime. </p>
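<p>For concreteness, here is a minimal Python sketch of that method. The function name is invented for this illustration, and for simplicity the loop tries every integer d from 2 upwards rather than only primes: a composite d can never divide what is left, because its prime factors have already been divided out. The loop stops once the square of d exceeds the remaining number, which is the same reasoning as the remark about 23 above.</p>

```python
def smallest_first_factorization(n):
    """Repeatedly divide out the smallest divisor greater than 1."""
    factors = []
    d = 2
    while n >= d * d:
        # Divide by d as many times as possible (possibly no times at all).
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Nothing up to the square root divides n, so what is left is prime.
        factors.append(n)
    return factors


print(smallest_first_factorization(575))  # [5, 5, 23]
print(smallest_first_factorization(1))    # [] (the empty product)
```

<p>Note what the sketch does and does not show: it always returns <em>a</em> factorization, and always the same one, but nothing in the code rules out the existence of a different factorization that some other method might find.</p>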
<p>But just because that <em>method</em> always gives the same answer (the method in the abstract being to keep dividing by the smallest prime that goes into the number you see in front of you), that doesn&#8217;t mean that there might not be some <em>other</em> method that gives a different answer. For example, what if you looked for the <em>largest</em> prime that went into your number? You&#8217;re probably thinking that you&#8217;ll just get the same list of primes, but written backwards. But how do you <em>know</em> this? Obviously that&#8217;s what you&#8217;ll get if there&#8217;s only one way of writing the number as a product of primes, but that&#8217;s what we&#8217;re trying to prove. If there&#8217;s another way of writing it as a product of primes, then perhaps the largest prime in the other way of doing things is larger than the largest prime that results from the usual method.</p>
<p><strong>Answer 3.</strong> <em>Look, it just bloody well isn&#8217;t obvious, OK?</em></p>
<p>Sorry, I lost it for a moment there. But if you persist in thinking that it&#8217;s obvious, then perhaps you can tell me why it is obvious that <img src='http://s0.wp.com/latex.php?latex=23%5Ctimes+1759&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='23&#92;times 1759' title='23&#92;times 1759' class='latex' /> is not the same number as <img src='http://s0.wp.com/latex.php?latex=53%5Ctimes+769&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='53&#92;times 769' title='53&#92;times 769' class='latex' />. I&#8217;ll save you a little time by revealing that all of 23, 53, 769 and 1759 are prime. I will not accept as an answer that if you calculate those two products you get different results. That to me is an admission that it <em>wasn&#8217;t</em> obvious that the answers would be different. If it was obvious, then why bother to calculate them?</p>
<p>By the way, I&#8217;ll grant you that sometimes it&#8217;s obvious that two products of primes are different. For example, if 2 is involved in one product and not the other, then the first product is even and the second is odd. However, even that second assertion depends on the (simple) result that a product of odd numbers is odd. We&#8217;d be able to see instantly that <img src='http://s0.wp.com/latex.php?latex=23%5Ctimes+1759%5Cne+53%5Ctimes+769&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='23&#92;times 1759&#92;ne 53&#92;times 769' title='23&#92;times 1759&#92;ne 53&#92;times 769' class='latex' /> if we knew that a product of two non-multiples of 23 was always a non-multiple of 23. But is there an easy way of showing that? We can work out the multiplication table mod 23, but that&#8217;s a bit tedious.  Alternatively, we can use some theory from the course &#8212; but unless you&#8217;re finding the course so easy that that theory (a proof derived from Euclid&#8217;s algorithm) is utterly obvious, then I don&#8217;t think you can call it obvious that <img src='http://s0.wp.com/latex.php?latex=23%5Ctimes+1759%5Cne+53%5Ctimes+769&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='23&#92;times 1759&#92;ne 53&#92;times 769' title='23&#92;times 1759&#92;ne 53&#92;times 769' class='latex' />.</p>
<p>Here&#8217;s another pair of products of primes for your delectation and delight: <img src='http://s0.wp.com/latex.php?latex=47%5Ctimes+863&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='47&#92;times 863' title='47&#92;times 863' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=73%5Ctimes+557&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='73&#92;times 557' title='73&#92;times 557' class='latex' />. Are they obviously different? It&#8217;s not clear which is bigger &#8212; they&#8217;re both a little over 40,000. What about the last digit? Damn, 1 in both cases. OK, let&#8217;s go for the second last digit, which is a bit of a cheat but still. In the first case we get the last digit of <img src='http://s0.wp.com/latex.php?latex=4%5Ctimes+3%2B7%5Ctimes+6%2B2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4&#92;times 3+7&#92;times 6+2' title='4&#92;times 3+7&#92;times 6+2' class='latex' />, which is 6. In the second case we get the last digit of <img src='http://s0.wp.com/latex.php?latex=7%5Ctimes+7%2B3%5Ctimes+5%2B2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='7&#92;times 7+3&#92;times 5+2' title='7&#92;times 7+3&#92;times 5+2' class='latex' />, which is again 6. So we&#8217;ve got two numbers that are a little bit above 40,000 that both end 61. As it happens <img src='http://s0.wp.com/latex.php?latex=47%5Ctimes+863%3D40561&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='47&#92;times 863=40561' title='47&#92;times 863=40561' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=73%5Ctimes+557%3D40661&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='73&#92;times 557=40661' title='73&#92;times 557=40661' class='latex' />.</p>
<p>If you wanted a quicker demonstration that those two numbers are different, you could work out what they are mod 3, which is a lot easier than working them out completely. But that&#8217;s not going to work in general. For instance, it doesn&#8217;t work for the first example, where the two products differ by 300, so the smallest modulus for which they differ is 7. If I took two pairs of absolutely huge numbers (with millions of digits), I could get their products to agree in almost all their digits and differ by a multiple of, say, 1000! And even if such small-modulus tests can be used, it isn&#8217;t obvious in advance that they will work if it isn&#8217;t obvious in advance that the products are different.</p>
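<p>The small-modulus test can be sketched as follows. The function names and the search limit of 100 are assumptions made for the sake of the example. The point is that each product is reduced mod m as it is built, so the full products are never computed.</p>

```python
def product_mod(factors, m):
    """Reduce the product of the given factors mod m without forming the full product."""
    result = 1
    for f in factors:
        result = (result * (f % m)) % m
    return result


def smallest_distinguishing_modulus(factors_a, factors_b, limit=100):
    """Return the smallest m in 2..limit for which the two products differ mod m, or None."""
    for m in range(2, limit + 1):
        if product_mod(factors_a, m) != product_mod(factors_b, m):
            return m
    return None


print(smallest_distinguishing_modulus([47, 863], [73, 557]))   # 3
print(smallest_distinguishing_modulus([23, 1759], [53, 769]))  # 7
```

<p>If the function returns <code>None</code>, that tells you nothing: the products might be equal, or they might merely agree for every modulus up to the limit.</p>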
<p><strong>Answer 4.</strong> <em>If it&#8217;s so obvious that every number has a unique factorization, then why is the corresponding statement false in a similar context?</em></p>
<p>Consider the collection of all numbers of the form <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' /> where <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are integers. (You might prefer to write these numbers as <img src='http://s0.wp.com/latex.php?latex=a%2Bib%5Csqrt%7B5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+ib&#92;sqrt{5}' title='a+ib&#92;sqrt{5}' class='latex' />, but I prefer <img src='http://s0.wp.com/latex.php?latex=%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sqrt{-5}' title='&#92;sqrt{-5}' class='latex' /> for reasons that I don&#8217;t want to go into here, but might mention in a future post.)</p>
<p>These numbers have various properties in common with the integers: you can add them and multiply them, there are identities for both addition and multiplication, and every number has an additive inverse. And as with integers, if you divide one by another, you don&#8217;t always get a third, so the notion of divisibility makes sense too. That means that we could if we wanted try to define a notion of a &#8220;prime&#8221; number of the form <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' />. </p>
<p>Just before I try to do that, let&#8217;s quickly decide what we mean by a prime when we allow negative numbers. Presumably we&#8217;re going to want, say, <img src='http://s0.wp.com/latex.php?latex=-5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-5' title='-5' class='latex' /> to be a prime, but what definition will lead to that? The small technical obstacle we face is that if we allow negative primes like that, then for a somewhat silly reason factorizations won&#8217;t be unique: for instance, <img src='http://s0.wp.com/latex.php?latex=15%3D3%5Ctimes+5%3D%28-3%29%5Ctimes%28-5%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='15=3&#92;times 5=(-3)&#92;times(-5)' title='15=3&#92;times 5=(-3)&#92;times(-5)' class='latex' />. The usual approach to this is to divide numbers into three kinds: prime numbers, composite numbers, and <em>units</em>. A unit is a number that has a multiplicative inverse, so in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> the units are 1 and -1. A prime is a number that cannot be written in the form <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> unless exactly one of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> is a unit. (I said &#8220;exactly&#8221; one because I didn&#8217;t want accidentally to define units themselves to be primes.) 
And now we can express the fundamental theorem of arithmetic in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> by saying that every number has exactly one factorization into primes, except that we count two factorizations as the same if the only difference (apart from the order) is that the primes in one factorization are multiplied by units to give the primes in the other factorization. For example, we count <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 5' title='3&#92;times 5' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28-5%29%5Ctimes%28-3%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(-5)&#92;times(-3)' title='(-5)&#92;times(-3)' class='latex' /> as the same, since we can reorder the second factorization as <img src='http://s0.wp.com/latex.php?latex=%28-3%29%5Ctimes%28-5%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(-3)&#92;times(-5)' title='(-3)&#92;times(-5)' class='latex' /> and then multiply both primes by the unit -1 to get <img src='http://s0.wp.com/latex.php?latex=3%5Ctimes+5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3&#92;times 5' title='3&#92;times 5' class='latex' />, which gives us the first factorization.</p>
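<p>As a small illustration of what &#8220;the same factorization&#8221; means here, the following Python sketch compares two lists of primes up to order and multiplication by the units 1 and -1. The function name is hypothetical, and the code simply assumes that every entry of both lists is a prime (possibly negative); it does not check this.</p>

```python
import math


def same_factorization_in_Z(f1, f2):
    """True if the two lists multiply to the same integer and agree up to order and sign."""
    if math.prod(f1) != math.prod(f2):
        return False
    # In the integers, p and q differ by a unit exactly when they have the same absolute value.
    return sorted(abs(p) for p in f1) == sorted(abs(p) for p in f2)


print(same_factorization_in_Z([3, 5], [-5, -3]))  # True
print(same_factorization_in_Z([3, 5], [3, 7]))    # False
```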
<p>In short, what we&#8217;re saying is that if two products of primes don&#8217;t obviously give the same number, then they give different numbers.</p>
<p>Right, back to numbers of the form <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' />. Let&#8217;s check that 2 is a prime in this ring. (A <em>ring</em> in this context is, roughly speaking, an algebraic structure with addition and multiplication with all the usual axioms apart from the existence of multiplicative inverses. You can think of it as something a bit like <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' />. However, the actual definition is a bit more general, as you can find out from <a href="http://en.wikipedia.org/wiki/Ring_(mathematics)">the relevant Wikipedia article</a>. Hmm, I&#8217;ve just looked at that article and I don&#8217;t like it at all: the list of examples is woefully inadequate. The important examples are eventually mentioned, but not in the list of basic examples, so you don&#8217;t get a good idea that they are the important ones.) First of all, 2 isn&#8217;t a unit, since <img src='http://s0.wp.com/latex.php?latex=1%2F2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1/2' title='1/2' class='latex' /> is not of the form <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' />. 
The modulus of <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Csqrt%7Ba%5E2%2B5b%5E2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sqrt{a^2+5b^2}' title='&#92;sqrt{a^2+5b^2}' class='latex' />, so if <img src='http://s0.wp.com/latex.php?latex=b%5Cne+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;ne 0' title='b&#92;ne 0' class='latex' />, then the modulus of <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' /> is bigger than 2. Moduli multiply, and every non-zero number of this form has modulus at least 1, so a factor with <img src='http://s0.wp.com/latex.php?latex=b%5Cne+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;ne 0' title='b&#92;ne 0' class='latex' /> would make the modulus of the product bigger than 2. It follows that the only way of writing 2 as a product of non-units would be to write it as a product of non-unit integers, which we can&#8217;t do. So 2 is prime. </p>
<p>A similar check can be run for 3. So 2 and 3 are primes. It&#8217;s also possible to show that <img src='http://s0.wp.com/latex.php?latex=1%2B%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1+&#92;sqrt{-5}' title='1+&#92;sqrt{-5}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=1-%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1-&#92;sqrt{-5}' title='1-&#92;sqrt{-5}' class='latex' /> are primes. But <img src='http://s0.wp.com/latex.php?latex=2%5Ctimes+3%3D%281%2B%5Csqrt%7B-5%7D%29%281-%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2&#92;times 3=(1+&#92;sqrt{-5})(1-&#92;sqrt{-5})' title='2&#92;times 3=(1+&#92;sqrt{-5})(1-&#92;sqrt{-5})' class='latex' />, so 6 has a non-unique factorization into primes. (It&#8217;s also easy to see that you can&#8217;t multiply <img src='http://s0.wp.com/latex.php?latex=2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2' title='2' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> by a unit to get one of <img src='http://s0.wp.com/latex.php?latex=1%5Cpm%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1&#92;pm&#92;sqrt{-5}' title='1&#92;pm&#92;sqrt{-5}' class='latex' />.)</p>
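<p>These checks can be mechanized. In the sketch below a number of this form is stored as a pair of integers (a, b), which is a representation chosen purely for illustration, and the &#8220;norm&#8221; is the square of the modulus. Because moduli multiply, so do norms. The units are the numbers of norm 1, the numbers 2 and 3 have norms 4 and 9, and the two numbers with real part 1 have norm 6, so a factorization of any of them into non-units would need a factor of norm 2 or 3. The code finds that there are none.</p>

```python
import math


def mul(u, v):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5), each stored as an integer pair."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)


def norm(u):
    """Squared modulus a^2 + 5*b^2 of the pair (a, b)."""
    a, b = u
    return a * a + 5 * b * b


def elements_of_norm(n):
    """All pairs (a, b) of integers with a^2 + 5*b^2 equal to n."""
    r = math.isqrt(n)
    return [(a, b) for a in range(-r, r + 1) for b in range(-r, r + 1)
            if norm((a, b)) == n]


print(elements_of_norm(1))   # [(-1, 0), (1, 0)], the units
print(elements_of_norm(2))   # []
print(elements_of_norm(3))   # []
print(mul((2, 0), (3, 0)))   # (6, 0)
print(mul((1, 1), (1, -1)))  # (6, 0)
```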
<p>Why is this a problem for people who hold that the fundamental theorem of arithmetic is obvious? It&#8217;s because they have to explain what it is about <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> that is relevantly different from the ring of numbers of the form <img src='http://s0.wp.com/latex.php?latex=a%2Bb%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+b&#92;sqrt{-5}' title='a+b&#92;sqrt{-5}' class='latex' />, which is denoted <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D%28%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}(&#92;sqrt{-5})' title='&#92;mathbb{Z}(&#92;sqrt{-5})' class='latex' />. Why can&#8217;t we just translate any proof that works for <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> into a proof that works for <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D%28%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}(&#92;sqrt{-5})' title='&#92;mathbb{Z}(&#92;sqrt{-5})' class='latex' />? </p>
<p>Here&#8217;s an example of how you can use <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D%28%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}(&#92;sqrt{-5})' title='&#92;mathbb{Z}(&#92;sqrt{-5})' class='latex' /> to defeat somebody who claims that the result is obvious in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' />. Let&#8217;s take the argument that you can just work the factorization out by repeatedly dividing by the smallest prime that goes into your number. Well, you can do that in <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D%28%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}(&#92;sqrt{-5})' title='&#92;mathbb{Z}(&#92;sqrt{-5})' class='latex' /> as well. Take 6, for instance. The smallest prime (in the sense of having smallest modulus) that goes into 6 is 2. Dividing by 2 we get 3, which is prime. So we&#8217;re done. So there can&#8217;t be another factorization. Except that there <em>is</em> another factorization. So the argument just isn&#8217;t an argument.</p>
<p>In a future post I&#8217;ll discuss the proof of the fundamental theorem of arithmetic. But this post is just to try to convince you (if you needed convincing, which you may not have) that the result is worth going to some effort to prove. </p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/13/why-isnt-the-fundamental-theorem-of-arithmetic-obvious/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Group actions II: the orbit-stabilizer theorem</title>
		<link>http://gowers.wordpress.com/2011/11/09/group-actions-ii-the-orbit-stabilizer-theorem/</link>
		<comments>http://gowers.wordpress.com/2011/11/09/group-actions-ii-the-orbit-stabilizer-theorem/#comments</comments>
		<pubDate>Wed, 09 Nov 2011 11:23:33 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3720</guid>
		<description><![CDATA[How many rotational symmetries does a cube have? This question can be answered in a number of ways. Perhaps the one that most readily occurs to people is this: each vertex can end up in one of eight places; once you&#8217;ve decided where to put it, there are three places you can put one of [...]]]></description>
			<content:encoded><![CDATA[<p>How many rotational symmetries does a cube have? This question can be answered in a number of ways. Perhaps the one that most readily occurs to people is this: each vertex can end up in one of eight places; once you&#8217;ve decided where to put it, there are three places you can put one of its neighbours; once you&#8217;ve decided where to put that, the rotation is determined, so the total number of rotations is <img src='http://s0.wp.com/latex.php?latex=8%5Ctimes+3%3D24&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='8&#92;times 3=24' title='8&#92;times 3=24' class='latex' />.</p>
<p>Here&#8217;s another proof. Take one of the faces. It can go to any one of the six faces (possibly the one it started on), and once you&#8217;ve decided which face it will go to, one of the vertices on the face has four places it can go, and once you&#8217;ve decided that, you&#8217;ve fixed the rotation. So the total number of rotations is <img src='http://s0.wp.com/latex.php?latex=6%5Ctimes+4%3D24&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='6&#92;times 4=24' title='6&#92;times 4=24' class='latex' />.</p>
<p>And here&#8217;s another. Take one of the midpoints of the twelve edges. There are twelve places it can end up, and once you&#8217;ve decided where to put it, there are two choices for how you send the two endpoints of the original edge to the endpoints of the new edge. So the total number of rotations is <img src='http://s0.wp.com/latex.php?latex=12%5Ctimes+2%3D24&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='12&#92;times 2=24' title='12&#92;times 2=24' class='latex' />.<br />
<span id="more-3720"></span></p>
<p>And here&#8217;s yet another. Take a point half way between the centre of one of the faces and one of the vertices of that face. (If the cube is the set of all points <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%2Cz%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y,z)' title='(x,y,z)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%2C+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x, y' title='x, y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=z&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='z' title='z' class='latex' /> all lie in the closed interval <img src='http://s0.wp.com/latex.php?latex=%5B-1%2C1%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='[-1,1]' title='[-1,1]' class='latex' />, then we could, for instance, take the point <img src='http://s0.wp.com/latex.php?latex=%281%2F2%2C1%2F2%2C1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1/2,1/2,1)' title='(1/2,1/2,1)' class='latex' />.) There are 24 places that that point can end up: the 24 points that are half way between the centre of some face and one of the vertices of that face. There is only one rotational symmetry that takes the original point to any given one of those 24 points. Therefore, the number of rotations of the cube is <img src='http://s0.wp.com/latex.php?latex=24%5Ctimes+1%3D24&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='24&#92;times 1=24' title='24&#92;times 1=24' class='latex' />.</p>
<p>If you understood those arguments, then you basically understand the orbit-stabilizer theorem, even if you think you don&#8217;t. To try to convince you of that, I&#8217;ll first show how to convert one of the proofs into a proof that explicitly uses the orbit-stabilizer theorem. I&#8217;ll then discuss how much you have to remember in order to remember the proof of the orbit-stabilizer theorem. As usual, I shall aim to show that the answer is &#8220;very little indeed&#8221;. However, routine proofs in abstract algebra have a somewhat different flavour from routine proofs in set theory and analysis, and as this one is a good representative example, it bears discussing in some detail. (Later I may try the same with the first isomorphism theorem, which you should be meeting fairly soon.)</p>
<p>To illustrate how to count symmetries using the orbit-stabilizer theorem, let me convert the <img src='http://s0.wp.com/latex.php?latex=6%5Ctimes+4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='6&#92;times 4' title='6&#92;times 4' class='latex' /> proof that used faces. The obvious way to capture the idea of &#8220;which face a particular face goes to&#8221; is to let the symmetry group of the cube act on the faces of the cube. Then the set of places that a particular face can go to is nothing other than the orbit of that face. How does the group act on the faces? Well, any rotation of the cube will permute the faces, and that permutation is the one that corresponds to the rotation.</p>
<p>In the earlier version of the argument, we said that our chosen face can go to any one of the six faces. The fancy way of saying that is that the orbit of the chosen face is the set of all faces, which has size 6. </p>
<p>And in the earlier version of the argument, we said that once we had decided which face our chosen face would go to, there were four ways of lining up the vertices. This doesn&#8217;t quite correspond to a statement about stabilizers, but let us note for now that if the chosen face maps to <em>itself</em>, then we have four choices. So the stabilizer of our chosen face has size 4. Multiplying those two numbers together gives us 24. </p>
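<p>All four counts of the rotations of the cube can be checked by brute force. The sketch below assumes the cube has its vertices at the points whose coordinates are all 1 or -1, so that its rotations are exactly the 3-by-3 matrices with one entry equal to 1 or -1 in each row and column and determinant 1. The function names are hypothetical. A face is represented by its centre, an edge by its midpoint, and the last point is the half-way point between a face centre and a vertex, scaled by 2 to keep the arithmetic in integers (scaling does not change orbits or stabilizers, since rotations are linear).</p>

```python
from itertools import permutations, product


def det(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def rotations():
    """All signed permutation matrices with determinant +1."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[0] * 3 for _ in range(3)]
            for row in range(3):
                m[row][perm[row]] = signs[row]
            if det(m) == 1:
                mats.append(tuple(tuple(r) for r in m))
    return mats


def act(m, p):
    """Apply the matrix m to the point p."""
    return tuple(sum(m[i][j] * p[j] for j in range(3)) for i in range(3))


def orbit_and_stabilizer_sizes(point):
    """Sizes of the orbit and the stabilizer of a point under the rotation group."""
    group = rotations()
    orbit = {act(g, point) for g in group}
    stabilizer = [g for g in group if act(g, point) == point]
    return len(orbit), len(stabilizer)


print(len(rotations()))                       # 24
print(orbit_and_stabilizer_sizes((0, 0, 1)))  # face centre: (6, 4)
print(orbit_and_stabilizer_sizes((1, 1, 1)))  # vertex: (8, 3)
print(orbit_and_stabilizer_sizes((1, 1, 0)))  # edge midpoint: (12, 2)
print(orbit_and_stabilizer_sizes((1, 1, 2)))  # scaled half-way point: (24, 1)
```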
<p>The content of the orbit-stabilizer theorem is basically that if a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> acts on a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, then once you know how many ways there are of sending an element <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to itself, you also know how many ways there are of sending it to any other element in its orbit. In the faces-of-the-cube example, knowing that there are four ways of rotating the cube so that a face <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> ends up where it was before tells us that there are also four ways of sending it to some other face <img src='http://s0.wp.com/latex.php?latex=F%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;' title='F&#039;' class='latex' />. And that is because &#8220;all faces look basically the same&#8221; from the point of view of the rotation group. In general, &#8220;all points of the orbit of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> look basically the same&#8221; from the point of view of the transformations performed by the elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />.</p>
<p>I&#8217;m going to give a few different proofs of the orbit-stabilizer theorem, not because they are fundamentally different, but because they have certain stylistic features that are worth commenting on. Also, it may be interesting to see how much variety is possible in presenting a proof, even when the underlying argument is the same. </p>
<p>First, I&#8217;ll remind you of the statement we are trying to prove.</p>
<p><strong>Theorem.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> be a group that acts on a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be an element of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, let <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> be the orbit of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> be the stabilizer of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=%7CO_x%7C%7CS_x%7C%3D%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|O_x||S_x|=|G|' title='|O_x||S_x|=|G|' class='latex' />.</em></p>
<p>In words, the size of the group is the size of the orbit times the size of the stabilizer.</p>
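<p>Here is a quick numerical sanity check of the statement in a case that has nothing to do with cubes. The example is chosen just for illustration: the group is the set of all 24 permutations of four objects, stored as tuples, and it acts on the two-element subsets of those objects. The function names are hypothetical.</p>

```python
from itertools import permutations


def act(g, subset):
    """Apply the permutation g (a tuple with g[i] the image of i) to a set of points."""
    return frozenset(g[i] for i in subset)


def check_orbit_stabilizer(n, subset):
    """Return (orbit size, stabilizer size, group size) for the symmetric group on n points."""
    group = list(permutations(range(n)))
    x = frozenset(subset)
    orbit = {act(g, x) for g in group}
    stabilizer = [g for g in group if act(g, x) == x]
    return len(orbit), len(stabilizer), len(group)


print(check_orbit_stabilizer(4, {0, 1}))  # (6, 4, 24)
```

<p>The orbit consists of all six two-element subsets, the stabilizer of the subset containing 0 and 1 has four elements, and the product of those two sizes is 24, the size of the group.</p>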
<p><strong>Proof 1.</strong> For this proof I want to try to capture as closely as possible our intuition in the cube example that the set of rotations that takes one face to another is &#8220;just like the set of rotations that fix that face, but over at another place&#8221;. However, I&#8217;ll argue in general.</p>
<p>What am I trying to show in the abstract? I want to show that if <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> belongs to the orbit of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, then the set of <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=gx%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=y' title='gx=y' class='latex' /> is the same size as <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />, which is the set of <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=gx%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=x' title='gx=x' class='latex' />. (Here I&#8217;m slipping into the condensed notation and writing <img src='http://s0.wp.com/latex.php?latex=gx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx' title='gx' class='latex' /> instead of <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%2Cx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g,x)' title='&#92;phi(g,x)' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)(x)' title='&#92;phi(g)(x)' class='latex' />. But it&#8217;s important to bear in mind that the &#8220;product&#8221; <img src='http://s0.wp.com/latex.php?latex=gx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx' title='gx' class='latex' /> is not the group operation, since it&#8217;s only <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> that belongs to a group. 
Rather, it is what one might call the group action operation, which has some nice group-like properties such as <img src='http://s0.wp.com/latex.php?latex=%28gh%29x%3Dg%28hx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(gh)x=g(hx)' title='(gh)x=g(hx)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=ex%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ex=x' title='ex=x' class='latex' />.)</p>
<p>What is the most natural way to show that two sets have the same size? It&#8217;s to produce a bijection between them. (It isn&#8217;t the only way. Another is to calculate their sizes, possibly by completely different methods, and show that you get the same answer in both cases. But that is not an appropriate method here, since we&#8217;re not given enough information to calculate anything.)</p>
<p>Let me write <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> for the set of <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> that take <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. If I&#8217;m given an element of <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />, how do I turn it into an element of <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />? Well, in the faces-of-cube example, what I might do is this. I fix, once and for all, a rotation <img src='http://s0.wp.com/latex.php?latex=%5Crho&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho' title='&#92;rho' class='latex' /> that takes my initial face to the given face. And then, given a rotation that fixes the initial face, I compose it with <img src='http://s0.wp.com/latex.php?latex=%5Crho&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho' title='&#92;rho' class='latex' /> to get a rotation that takes the initial face to the given face. </p>
<p>Let&#8217;s try that with <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> belongs to the orbit of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> (by assumption), there is some <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hx%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx=y' title='hx=y' class='latex' />. Now let me define a map from <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> by mapping <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' />. 
Note that <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' /> takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> first to <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> (since <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' />) and then to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> (since <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' />). Thus, <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' /> really does belong to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />.</p>
<p>However, is the map we&#8217;ve just defined a <em>bijection</em>? Let&#8217;s see if we can prove that it&#8217;s an injection and a surjection. No, let&#8217;s <em>not</em> do that. Let&#8217;s try to find an inverse. Given an element <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />, how might we convert it into an element of <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />? There&#8217;s only one thing we can even dream of trying at this point. Since <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> also takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}u' title='h^{-1}u' class='latex' /> takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, as we want. 
So the map <img src='http://s0.wp.com/latex.php?latex=u%5Cmapsto+h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;mapsto h^{-1}u' title='u&#92;mapsto h^{-1}u' class='latex' /> takes <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />. Does it invert the previous map? Yes of course it does.</p>
<p>We&#8217;ve shown that for each <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' /> there are precisely <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|' title='|S_x|' class='latex' /> elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> that take <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. But every element of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> takes <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <em>something</em> in the orbit, so <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%3D%7CO_x%7C%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|=|O_x||S_x|' title='|G|=|O_x||S_x|' class='latex' />, as we were trying to prove. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
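<p>If you would like to see the argument happen in a concrete case, here is a minimal sketch in Python. The choice of group is mine and purely illustrative: I take <em>G</em> to be all permutations of {0,1,2,3}, stored as tuples, and <em>x</em>=0. The helper names <code>compose</code>, <code>inverse</code>, <code>S_xy</code> and <code>check_bijection</code> are made up for this example. For each <em>y</em> in the orbit it picks an arbitrary <em>h</em>, applies the map that sends <em>g</em> to <em>hg</em>, and checks that the map that sends <em>u</em> to <em>h</em><sup>-1</sup><em>u</em> undoes it.</p>

```python
from itertools import permutations

# Assumption for illustration: G is all permutations of {0,1,2,3}, acting in the obvious way.
G = list(permutations(range(4)))

def compose(h, g):
    # (hg)(i) = h(g(i)): apply g first, then h
    return tuple(h[g[i]] for i in range(len(g)))

def inverse(h):
    inv = [0] * len(h)
    for i, image in enumerate(h):
        inv[image] = i
    return tuple(inv)

x = 0
S_x = [g for g in G if g[x] == x]      # the stabilizer of x
orbit = sorted({g[x] for g in G})      # the orbit of x

def S_xy(y):
    # all elements of G that take x to y
    return [g for g in G if g[x] == y]

def check_bijection(y):
    h = S_xy(y)[0]                                   # an arbitrary, non-canonical choice
    image = [compose(h, g) for g in S_x]             # g maps to hg
    back = [compose(inverse(h), u) for u in image]   # u maps to h^{-1}u
    return set(image) == set(S_xy(y)) and back == S_x

all_ok = all(check_bijection(y) for y in orbit)
print(len(G), len(orbit), len(S_x), all_ok)
```

<p>Of course, a check on one small group proves nothing in general, but it does make it clear what the two maps are doing.</p>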
<p>In case you found that off-puttingly long, here it is stripped of all the accompanying chat.</p>
<p><strong>The real Proof 1.</strong> Let <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be an arbitrary element of <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' />, and let <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D%3D%5C%7Bg%5Cin+G%3Agx%3Dy%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' title='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' class='latex' />. Pick <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and define a map <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AS_x%5Cto+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:S_x&#92;to S_{xy}' title='&#92;phi:S_x&#92;to S_{xy}' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3Ag%5Cto+hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:g&#92;to hg' title='&#92;phi:g&#92;to hg' class='latex' />. The map <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%3AS_%7Bxy%7D%5Cto+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi:S_{xy}&#92;to S_x' title='&#92;psi:S_{xy}&#92;to S_x' class='latex' /> defined by <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%3Au%5Cto+h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi:u&#92;to h^{-1}u' title='&#92;psi:u&#92;to h^{-1}u' class='latex' /> inverts <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C%3D%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|=|S_{xy}|' title='|S_x|=|S_{xy}|' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' />. 
But the sets <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%3D%7CS_x%7C%7CO_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|=|S_x||O_x|' title='|G|=|S_x||O_x|' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>I cheated slightly there by not checking carefully that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> really does take <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> and that <img src='http://s0.wp.com/latex.php?latex=%5Cpsi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi' title='&#92;psi' class='latex' /> really does take <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />. But those facts are easy enough to be left to the reader to do in his/her head.</p>
<p>That proof is so short that one might wonder why another proof could possibly be desirable. Well, it&#8217;s mostly an aesthetic point, at least as far as this proof is concerned, but there&#8217;s something a bit ugly about choosing that <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' />. Why? Because I didn&#8217;t say how to do it. So, for example, the way I chose <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> for one <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> might be completely unrelated to the way I chose it for another <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. Wouldn&#8217;t it be nicer if there were some natural way of <em>defining</em> <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> in terms of <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />?</p>
<p>Well, yes it would, but it&#8217;s not that easy to do. Just think of the cube example. Suppose I take a face <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> and I ask you to define, for each other face <img src='http://s0.wp.com/latex.php?latex=F%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;' title='F&#039;' class='latex' />, a rotation of the cube that takes <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=F%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;' title='F&#039;' class='latex' />. I want you to do it in a nice systematic way. How might you do it? Well, if <img src='http://s0.wp.com/latex.php?latex=F%27%3DF&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;=F' title='F&#039;=F' class='latex' /> then probably the most natural thing to do is take the identity. How about if <img src='http://s0.wp.com/latex.php?latex=F%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;' title='F&#039;' class='latex' /> is one of the neighbouring faces to <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' />? Perhaps the most natural thing there is to take a 90-degree rotation about an axis through the centre of the cube that&#8217;s parallel to the edge where <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=F%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#039;' title='F&#039;' class='latex' /> meet. But then what about the face opposite <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' />? 
There are four ways of rotating the cube so that <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> lands up at the opposite face, and they are of two kinds: two of them are half turns about axes that go through the midpoints of two opposite faces, and the other two are half turns about axes that go through the midpoints of two opposite <em>edges</em>. (If you draw a diagram, I hope you&#8217;ll be able to see what I&#8217;m talking about here. It&#8217;s one of those things that may be easier to work out for yourself.) Anyhow, for each of these two types of half turn, there is absolutely <em>nothing</em> to choose between the two half turns of that type. So there just <em>isn&#8217;t</em> a natural way to choose a rotation for each face. As mathematicians might say, there isn&#8217;t a <em>canonical choice</em>.</p>
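<p>If drawing a diagram does not appeal, the count can also be checked by brute force. The sketch below rests on a modelling assumption of mine: the cube is centred at the origin with faces perpendicular to the coordinate axes, so that its rotations are the 3&#215;3 signed permutation matrices of determinant 1, and the face <em>F</em> is represented by its outward normal (0,0,1). The function names are invented for the example.</p>

```python
from itertools import permutations, product

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def signed_permutation_matrix(perm, signs):
    # Column j has a single nonzero entry, signs[j], in row perm[j].
    return tuple(tuple(signs[j] if perm[j] == i else 0 for j in range(3))
                 for i in range(3))

def apply(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

# Assumption: rotations of the cube = signed permutation matrices with determinant 1.
rotations = []
for perm in permutations(range(3)):
    for signs in product((1, -1), repeat=3):
        m = signed_permutation_matrix(perm, signs)
        if det3(m) == 1:
            rotations.append(m)

F = (0, 0, 1)           # outward normal of the chosen face
opposite = (0, 0, -1)   # outward normal of the opposite face
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

to_opposite = [m for m in rotations if apply(m, F) == opposite]

def is_half_turn(m):
    # m is not the identity here, so m applied twice being the identity means a half turn
    return all(apply(m, apply(m, e)) == e for e in basis)

def is_diagonal(m):
    return all(m[i][j] == 0 for i in range(3) for j in range(3) if i != j)

# Diagonal matrices are half turns about a coordinate axis (through two face midpoints);
# the others are half turns about an axis through two opposite edge midpoints.
face_axis = [m for m in to_opposite if is_diagonal(m)]
edge_axis = [m for m in to_opposite if not is_diagonal(m)]
all_half_turns = all(is_half_turn(m) for m in to_opposite)
print(len(rotations), len(to_opposite), len(face_axis), len(edge_axis), all_half_turns)
```

<p>Within each of the two kinds, swapping the roles of the first two coordinates exchanges the two half turns, which is one way of seeing that there is nothing to choose between them.</p>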
<p>So is that the end of the story? Not quite. There&#8217;s a useful principle that sometimes applies in this kind of situation. It says this.</p>
<ul>
<li><em>If you can&#8217;t make a canonical choice, then make all choices at once.</em></li>
</ul>
<p>This meme was embedded in my brain as a result of <a href="http://mathoverflow.net/questions/50025/problems-where-we-cant-make-a-canonical-choice-solved-by-looking-at-all-choices">a nice MathOverflow question</a>, though I suppose I was aware of it less consciously before that. Let&#8217;s see how it plays out with our proof.</p>
<p>If I don&#8217;t choose just one <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' />, then what do I do instead? I choose <em>all</em> <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> that belong to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />. </p>
<p>A problem then arises when I try to define a bijection from <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' />. Never mind, though. Let&#8217;s just define a map from <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}&#92;times S_x' title='S_{xy}&#92;times S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> by sending <img src='http://s0.wp.com/latex.php?latex=%28h%2Cg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(h,g)' title='(h,g)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' />. Note that this is like what we did before, but we&#8217;re doing it for every <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and not just one chosen <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' />. And what&#8217;s nice about it is that we are now doing precisely the same thing for every <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />.</p>
<p>Let&#8217;s call our map <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. Obviously, <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is not a bijection, but that doesn&#8217;t matter. We&#8217;re trying to show that it&#8217;s a bit like a putting together of <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|' title='|S_{xy}|' class='latex' /> bijections. So what we&#8217;d like to show is that every element of <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> has exactly <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|' title='|S_{xy}|' class='latex' /> preimages under <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. That will tell us that <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C%7CS_x%7C%3D%7CS_%7Bxy%7D%7C%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}||S_x|=|S_{xy}|^2' title='|S_{xy}||S_x|=|S_{xy}|^2' class='latex' />, which will imply that <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C%3D%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|=|S_x|' title='|S_{xy}|=|S_x|' class='latex' />.</p>
<p>So let&#8217;s take <img src='http://s0.wp.com/latex.php?latex=u%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;in S_{xy}' title='u&#92;in S_{xy}' class='latex' />. How many preimages has it got? Well, a preimage of <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> is a pair <img src='http://s0.wp.com/latex.php?latex=%28h%2Cg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(h,g)' title='(h,g)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hg%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg=u' title='hg=u' class='latex' />. So I want to show that there are precisely <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|' title='|S_{xy}|' class='latex' /> solutions of the equation <img src='http://s0.wp.com/latex.php?latex=hg%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg=u' title='hg=u' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' />. But that&#8217;s easy: for each <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> there is precisely one possible <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />, namely <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}u' title='h^{-1}u' class='latex' />, which does indeed belong to <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />. </p>
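<p>Here is a small sketch of that counting in Python, for anyone who likes to see numbers. As an illustrative assumption I take <em>G</em> to be all permutations of {0,1,2,3} with <em>x</em>=0, and the names <code>compose</code> and <code>preimage_counts</code> are invented for the example. For each <em>y</em> in the orbit it forms every product <em>hg</em> with <em>h</em> taking <em>x</em> to <em>y</em> and <em>g</em> fixing <em>x</em>, and counts how often each result occurs.</p>

```python
from collections import Counter
from itertools import permutations

# Assumption for illustration: G is all permutations of {0,1,2,3}, and x = 0.
G = list(permutations(range(4)))

def compose(h, g):
    # (hg)(i) = h(g(i)): apply g first, then h
    return tuple(h[g[i]] for i in range(len(g)))

x = 0
S_x = [g for g in G if g[x] == x]
orbit = sorted({g[x] for g in G})

def preimage_counts(y):
    S_xy = [g for g in G if g[x] == y]
    # Count how many pairs (h, g) give each product hg.
    counts = Counter(compose(h, g) for h in S_xy for g in S_x)
    return S_xy, counts

uniform = True
for y in orbit:
    S_xy, counts = preimage_counts(y)
    # Every product lands in S_xy, and every element of S_xy is hit ...
    uniform = uniform and set(counts) == set(S_xy)
    # ... exactly |S_xy| times.
    uniform = uniform and set(counts.values()) == {len(S_xy)}

print(uniform)
```

<p>Counting the pairs in two ways, once as the size of the product set and once by adding up the preimages, is exactly the equation between the two products of cardinalities.</p>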
<p>Let me again give the condensed version, just to convince you that what I&#8217;ve written doesn&#8217;t add up to a long proof. In fact, I don&#8217;t even need to bother to define the function <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />.</p>
<p><strong>Proof 2.</strong> Let <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be an arbitrary element of <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D%3D%5C%7Bg%5Cin+G%3Agx%3Dy%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' title='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' class='latex' />. For every <img src='http://s0.wp.com/latex.php?latex=u%2Ch%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u,h&#92;in S_{xy}' title='u,h&#92;in S_{xy}' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hg%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg=u' title='hg=u' class='latex' />, namely <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}u' title='h^{-1}u' class='latex' />. It follows that every <img src='http://s0.wp.com/latex.php?latex=u%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;in S_{xy}' title='u&#92;in S_{xy}' class='latex' /> can be written in <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|' title='|S_{xy}|' class='latex' /> ways as <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' />. 
Also, <img src='http://s0.wp.com/latex.php?latex=hg%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg&#92;in S_{xy}' title='hg&#92;in S_{xy}' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and every <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C%7CS_x%7C%3D%7CS_%7Bxy%7D%7C%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}||S_x|=|S_{xy}|^2' title='|S_{xy}||S_x|=|S_{xy}|^2' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C%3D%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|=|S_{xy}|' title='|S_x|=|S_{xy}|' class='latex' />. But the sets <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%3D%7CS_x%7C%7CO_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|=|S_x||O_x|' title='|G|=|S_x||O_x|' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>Here is a variant of the above argument that I like better.</p>
<p><strong>Proof 3.</strong> Let <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be an arbitrary element of <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D%3D%5C%7Bg%5Cin+G%3Agx%3Dy%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' title='S_{xy}=&#92;{g&#92;in G:gx=y&#92;}' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=u%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;in S_{xy}' title='u&#92;in S_{xy}' class='latex' /> and let us see in how many ways we can write <img src='http://s0.wp.com/latex.php?latex=u%3Dhg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u=hg' title='u=hg' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' />. Well, for each <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' />, there is exactly one <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hg%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg=u' title='hg=u' class='latex' />, namely <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}u' title='h^{-1}u' class='latex' />. 
So we can write <img src='http://s0.wp.com/latex.php?latex=u%3Dhg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u=hg' title='u=hg' class='latex' /> in <img src='http://s0.wp.com/latex.php?latex=%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_{xy}|' title='|S_{xy}|' class='latex' /> ways. But also, for each <img src='http://s0.wp.com/latex.php?latex=g%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in S_x' title='g&#92;in S_x' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hg%3Du&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg=u' title='hg=u' class='latex' />, namely <img src='http://s0.wp.com/latex.php?latex=ug%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ug^{-1}' title='ug^{-1}' class='latex' />. This shows that we can write <img src='http://s0.wp.com/latex.php?latex=u%3Dhg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u=hg' title='u=hg' class='latex' /> in <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|' title='|S_x|' class='latex' /> ways. Therefore, <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C%3D%7CS_%7Bxy%7D%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|=|S_{xy}|' title='|S_x|=|S_{xy}|' class='latex' />. But the sets <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. 
It follows that <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%3D%7CS_x%7C%7CO_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|=|S_x||O_x|' title='|G|=|S_x||O_x|' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
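<p>The double count in Proof 3 can also be watched in a tiny example. This time, as a hypothetical choice of mine, I take <em>G</em> to be the four permutations of {0,1,2,3} that map the set {0,1} to itself, with <em>x</em>=0 and <em>y</em>=1, so that the orbit of <em>x</em> is not the whole set. The sketch lists all solutions of <em>hg</em>=<em>u</em> for one fixed <em>u</em> and checks that each admissible <em>h</em>, and each admissible <em>g</em>, appears in exactly one of them.</p>

```python
from itertools import permutations

# Hypothetical example: the permutations of {0,1,2,3} that map the set {0,1} to itself.
G = [p for p in permutations(range(4)) if {p[0], p[1]} == {0, 1}]

def compose(h, g):
    # (hg)(i) = h(g(i)): apply g first, then h
    return tuple(h[g[i]] for i in range(len(g)))

x, y = 0, 1
S_x = [g for g in G if g[x] == x]     # elements fixing x
S_xy = [g for g in G if g[x] == y]    # elements taking x to y
orbit = {g[x] for g in G}

u = S_xy[0]                           # one fixed element of S_xy
solutions = [(h, g) for h in S_xy for g in S_x if compose(h, g) == u]

hs = [h for h, g in solutions]
gs = [g for h, g in solutions]
each_h_once = sorted(hs) == sorted(S_xy)   # counting by h gives |S_xy|
each_g_once = sorted(gs) == sorted(S_x)    # counting by g gives |S_x|

print(len(solutions), each_h_once, each_g_once, len(G) == len(S_x) * len(orbit))
```

<p>The two checks are the two ways of counting the same list of solutions, which is all that the proof uses.</p>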
<p>Or do I like it better? It&#8217;s starting to look, with its non-canonical choice of <img src='http://s0.wp.com/latex.php?latex=u%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;in S_{xy}' title='u&#92;in S_{xy}' class='latex' />, a bit too like the first argument.</p>
<p>I still don&#8217;t feel as though I&#8217;ve arrived at the neatest and most symmetrical argument, and I&#8217;ve also not satisfied another urge, which is to prove that <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%3D%7CO_x%7C%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|=|O_x||S_x|' title='|G|=|O_x||S_x|' class='latex' /> by finding a nice bijection between <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=O_x%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x&#92;times S_x' title='O_x&#92;times S_x' class='latex' />. There are good reasons for the second problem, as I&#8217;ve already discussed, but I might be able to satisfy my urge by finding not a one-to-one correspondence, but something a bit more general and multivalued. </p>
<p>So let&#8217;s take a point <img src='http://s0.wp.com/latex.php?latex=%28y%2Cg%29%5Cin+O_x%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(y,g)&#92;in O_x&#92;times S_x' title='(y,g)&#92;in O_x&#92;times S_x' class='latex' />. What can we do with it? We can use it to define the element <img src='http://s0.wp.com/latex.php?latex=gy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gy' title='gy' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, but that doesn&#8217;t seem very exciting or helpful. What we&#8217;d really like is to define an element of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, which earlier we did by fixing some <img src='http://s0.wp.com/latex.php?latex=h%5Cin+S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in S_{xy}' title='h&#92;in S_{xy}' class='latex' /> and taking <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' />. But that felt too non-canonical, so instead we preferred to take <em>all</em> <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' />. </p>
<p>If we take all <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' />, then that&#8217;s saying that we can get from <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> in several different ways. So we seem to get a map from <img src='http://s0.wp.com/latex.php?latex=G%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G&#92;times S_x' title='G&#92;times S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, which is very simple: it takes <img src='http://s0.wp.com/latex.php?latex=%28h%2Cg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(h,g)' title='(h,g)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' />. But what&#8217;s the point of it? And where does <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> come in? </p>
<p>One way of making <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> come in is to go one step further and let <img src='http://s0.wp.com/latex.php?latex=hg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hg' title='hg' class='latex' /> act on <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to create <img src='http://s0.wp.com/latex.php?latex=hgx%3Dhx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hgx=hx' title='hgx=hx' class='latex' />. So now we&#8217;ve got a map from <img src='http://s0.wp.com/latex.php?latex=G%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G&#92;times S_x' title='G&#92;times S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' />, the map that takes <img src='http://s0.wp.com/latex.php?latex=%28h%2Cg%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(h,g)' title='(h,g)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=hx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx' title='hx' class='latex' />. How many preimages does an element <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> have? </p>
<p>I&#8217;m not going to answer that question (though I could) because once we&#8217;ve got to the point of mapping <img src='http://s0.wp.com/latex.php?latex=G%5Ctimes+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G&#92;times S_x' title='G&#92;times S_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> using the map above, we see that <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> isn&#8217;t really playing a role, so why not just focus on the map from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> (which I was calling <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> above) to <img src='http://s0.wp.com/latex.php?latex=gx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx' title='gx' class='latex' />? If <img src='http://s0.wp.com/latex.php?latex=y%5Cin+O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in O_x' title='y&#92;in O_x' class='latex' />, then how many preimages does <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> have? </p>
<p>I won&#8217;t bother to answer that question either: it&#8217;s the same as asking how big the set <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> is, and we&#8217;ve established already that that is <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|' title='|S_x|' class='latex' />, so it wouldn&#8217;t be a particularly new proof that resulted. However, one remark that&#8217;s worth making is that the set <img src='http://s0.wp.com/latex.php?latex=S_%7Bxy%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_{xy}' title='S_{xy}' class='latex' /> is a left coset of <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />. Probably the proof you were given in lectures was this.</p>
<p><strong>Proof 4.</strong> We shall show that there is a bijection between <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> and the set of left cosets of <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' />. To do this, we map the left coset <img src='http://s0.wp.com/latex.php?latex=gS_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gS_x' title='gS_x' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=gx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx' title='gx' class='latex' />. We must show that this is well-defined. But if <img src='http://s0.wp.com/latex.php?latex=gS_x%3DhS_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gS_x=hS_x' title='gS_x=hS_x' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7DgS_x%3DS_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}gS_x=S_x' title='h^{-1}gS_x=S_x' class='latex' />, and therefore, since <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> is a subgroup, <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dg%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}g&#92;in S_x' title='h^{-1}g&#92;in S_x' class='latex' />. From that it follows that <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dgx%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}gx=x' title='h^{-1}gx=x' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=gx%3Dhx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=hx' title='gx=hx' class='latex' />. </p>
<p>The map is also a bijection. If <img src='http://s0.wp.com/latex.php?latex=gx%3Dhx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=hx' title='gx=hx' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dgx%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}gx=x' title='h^{-1}gx=x' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dg%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}g&#92;in S_x' title='h^{-1}g&#92;in S_x' class='latex' /> and hence <img src='http://s0.wp.com/latex.php?latex=gS_x%3DhS_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gS_x=hS_x' title='gS_x=hS_x' class='latex' />, which shows that it is injective. It is surjective because every element of <img src='http://s0.wp.com/latex.php?latex=O_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='O_x' title='O_x' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=gx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx' title='gx' class='latex' /> for some <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />.</p>
<p>Since the number of left cosets of <img src='http://s0.wp.com/latex.php?latex=S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_x' title='S_x' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%7CG%7C%2F%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|/|S_x|' title='|G|/|S_x|' class='latex' />, we are done. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
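<p>The bijection in Proof 4 can be checked by brute force on a small example. The following Python sketch is only an illustration under stated assumptions: it takes the group to be the symmetric group on four points acting on {0, 1, 2, 3}, stores each permutation as a tuple, and uses helper names (<code>compose</code>, <code>coset_to_point</code>) that are made up for the purpose.</p>

```python
from itertools import permutations

# Assumption: G is S_4 acting on X = {0, 1, 2, 3}; a permutation p is a
# tuple with p[i] the image of i. All names here are illustrative.
G = list(permutations(range(4)))

def compose(g, h):
    """Return the permutation 'apply h first, then g'."""
    return tuple(g[h[i]] for i in range(len(h)))

x = 0
stabilizer = [g for g in G if g[x] == x]
orbit = {g[x] for g in G}

# Each left coset g S_x, stored as a frozenset so equal cosets coincide.
cosets = {frozenset(compose(g, s) for s in stabilizer) for g in G}

# The map from g S_x to g x is well-defined: every element of a coset
# sends x to the same point.
coset_to_point = {}
for coset in cosets:
    images = {g[x] for g in coset}
    assert len(images) == 1
    coset_to_point[coset] = images.pop()

# It is a bijection onto the orbit, so |O_x| = |G| / |S_x|.
assert set(coset_to_point.values()) == orbit
assert len(cosets) == len(orbit)
print(len(G), len(stabilizer), len(orbit))
```

<p>With these choices the script prints <code>24 6 4</code>: the group has 24 elements, the stabilizer of 0 has 6, and there are 4 cosets, one for each point of the orbit.</p>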
<p>If the phrase &#8220;well-defined&#8221; worries you in the above argument, then I recommend <a href="http://gowers.wordpress.com/2009/06/08/why-arent-all-functions-well-defined/">a post I once wrote about what it means</a>. </p>
<p>Here&#8217;s a different way of writing the above proof, where we think more about equivalence relations than about partitions.</p>
<p><strong>Proof 5.</strong> Here are two equivalence relations on <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. For the first, I define <img src='http://s0.wp.com/latex.php?latex=g%5Csim_1h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;sim_1h' title='g&#92;sim_1h' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=gx%3Dhx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=hx' title='gx=hx' class='latex' />. For the second, I define <img src='http://s0.wp.com/latex.php?latex=g%5Csim_2h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;sim_2h' title='g&#92;sim_2h' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dg%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}g&#92;in S_x' title='h^{-1}g&#92;in S_x' class='latex' />. Note that this is the same as saying that <img src='http://s0.wp.com/latex.php?latex=g%5E%7B-1%7Dh%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g^{-1}h&#92;in S_x' title='g^{-1}h&#92;in S_x' class='latex' /> and also the same as saying that <img src='http://s0.wp.com/latex.php?latex=gS_x%3DhS_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gS_x=hS_x' title='gS_x=hS_x' class='latex' />. </p>
<p>Now <img src='http://s0.wp.com/latex.php?latex=gx%3Dhx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gx=hx' title='gx=hx' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dgx%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}gx=x' title='h^{-1}gx=x' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=h%5E%7B-1%7Dg%5Cin+S_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h^{-1}g&#92;in S_x' title='h^{-1}g&#92;in S_x' class='latex' />. So the two equivalence relations are the same. The number of equivalence classes for the first relation is obviously <img src='http://s0.wp.com/latex.php?latex=%7CO_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|O_x|' title='|O_x|' class='latex' /> and the size of each equivalence class for the second relation is obviously <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|' title='|S_x|' class='latex' />, so the result is proved. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<hr />
<p>Let me make one final remark about the orbit-stabilizer theorem. Why, one might ask, is it a useful result? A rather general answer is that it gives us a relationship between three quantities, namely <img src='http://s0.wp.com/latex.php?latex=%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|' title='|G|' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=%7CS_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|S_x|' title='|S_x|' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7CO_x%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|O_x|' title='|O_x|' class='latex' />, that allows us to determine any one of them from the other two. Situations crop up quite frequently in group theory where it is not very easy to see what one of these quantities is directly, but quite easy to calculate the other two. The orbit-stabilizer theorem then gives us the hard one too. For example, in the case of counting rotational symmetries of a cube, it isn&#8217;t easy to think of all the rotations unless you partition them in some nice way, such as looking at what they do to a particular vertex or face, which, as we saw before, amounts to counting orbits and stabilizers. </p>
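<p>For the cube, the count can be confirmed by machine. The sketch below is a hypothetical set-up rather than anything from the course: it assumes the cube has vertices with coordinates 0 or 1, records each rotation by the permutation it induces on the eight vertices, and generates the rotation group from two quarter turns. It then compares the size of the group with the size of the orbit of one vertex times the size of its stabilizer.</p>

```python
from itertools import product

# Assumption: the cube has vertices (x, y, z) with coordinates 0 or 1, and
# a rotation is recorded as the permutation it induces on the 8 vertices.
vertices = list(product((0, 1), repeat=3))
index = {v: i for i, v in enumerate(vertices)}

def as_permutation(move):
    """Turn a map on vertex coordinates into a tuple of vertex indices."""
    return tuple(index[move(v)] for v in vertices)

# Quarter turns about the axes through the centres of two pairs of faces.
turn_z = as_permutation(lambda v: (1 - v[1], v[0], v[2]))
turn_x = as_permutation(lambda v: (v[0], 1 - v[2], v[1]))

def compose(g, h):
    """Return the permutation 'apply h first, then g'."""
    return tuple(g[h[i]] for i in range(len(h)))

# Generate the rotation group by closing under the two generators.
identity = tuple(range(8))
group = {identity}
frontier = [identity]
while frontier:
    g = frontier.pop()
    for t in (turn_z, turn_x):
        new = compose(t, g)
        if new not in group:
            group.add(new)
            frontier.append(new)

x = 0  # the vertex (0, 0, 0)
orbit = {g[x] for g in group}
stabilizer = [g for g in group if g[x] == x]
print(len(orbit), len(stabilizer), len(group))
```

<p>The script prints <code>8 3 24</code>: a vertex can be taken to any of the 8 vertices, 3 rotations fix it, and the product 8 &#215; 3 agrees with the 24 rotations found by exhaustive generation.</p>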
<p>It&#8217;s not always <img src='http://s0.wp.com/latex.php?latex=%7CG%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|G|' title='|G|' class='latex' /> that&#8217;s the tough one to get a handle on. For example, in question 10 of <a href="http://www.dpmms.cam.ac.uk/study/IA/Groups/2011-2012/gps311.pdf">Examples Sheet 3</a> it&#8217;s the size of the orbit that isn&#8217;t obvious. (In general with the bunch of questions around there, I recommend thinking to yourself, &#8220;What is the orbit-stabilizer theorem giving me?&#8221;)</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/09/group-actions-ii-the-orbit-stabilizer-theorem/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Group actions I</title>
		<link>http://gowers.wordpress.com/2011/11/06/group-actions-i/</link>
		<comments>http://gowers.wordpress.com/2011/11/06/group-actions-i/#comments</comments>
		<pubDate>Sun, 06 Nov 2011 15:42:42 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3704</guid>
		<description><![CDATA[There is something odd about the experience of learning group theory. At first, one is told that the great virtue of groups is their abstractness: many mathematical structures, from number systems, to sets of permutations, to symmetries, to automorphisms of other algebraic structures, to invariants of geometric objects (these last two are examples you won&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>There is something odd about the experience of learning group theory. At first, one is told that the great virtue of groups is their <em>abstractness</em>: many mathematical structures, from number systems, to sets of permutations, to symmetries, to automorphisms of other algebraic structures, to invariants of geometric objects (these last two are examples you won&#8217;t meet for a while) have important properties in common, and these are encapsulated in a small set of axioms that lead to a rich theory with applications throughout mathematics. So far so good &#8212; understanding about abstraction is wonderful and mind-expanding and the definition of a group is one of the best examples.</p>
<p>But then one studies group actions (and later group representations). They appear to be doing the reverse of abstraction: we take an abstract group and find a way of thinking of it as a group of symmetries. And that is supposed to help us understand the group better &#8212; so much so that group actions are an indispensable part of group theory. </p>
<p>So is abstraction good or bad? Well, both the views above are correct. Abstraction does indeed play a very important clarifying role, by showing us that many apparently different phenomena are basically the same, and isolating the aspects of those phenomena that really matter. However, if a group is defined for us in an abstract way (I&#8217;ll say more precisely what I mean by this later), then showing that it is isomorphic to a group of symmetries can make it much easier to answer questions about that group.</p>
<p>In this post, and one or two further ones, I want to discuss what a group action actually is, the orbit-stabilizer theorem and how to remember its proof, and how to use group actions to prove facts about groups.<br />
<span id="more-3704"></span></p>
<p><strong>What is a group action?</strong></p>
<p>There are two ways of defining group actions. I don&#8217;t know which one you were given in lectures, but most people give the first and then mention the second more as an aside.</p>
<p><strong>Definition 1.</strong> Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> be a group and let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be a set. An <em>action</em> of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is a function <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Ctimes+X%5Cto+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;times X&#92;to X' title='&#92;phi:G&#92;times X&#92;to X' class='latex' /> with the following properties: <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28e%2Cx%29%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(e,x)=x' title='&#92;phi(e,x)=x' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%2C%5Cphi%28h%2Cx%29%29%3D%5Cphi%28gh%2Cx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g,&#92;phi(h,x))=&#92;phi(gh,x)' title='&#92;phi(g,&#92;phi(h,x))=&#92;phi(gh,x)' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=g%2Ch%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g,h&#92;in G' title='g,h&#92;in G' class='latex' /> and every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />. </p>
<p><strong>Definition 2.</strong> Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> be a group and let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be a set. Let <img src='http://s0.wp.com/latex.php?latex=S%28X%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S(X)' title='S(X)' class='latex' /> be the group of all permutations of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. An <em>action</em> of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+S%28X%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to S(X)' title='&#92;phi:G&#92;to S(X)' class='latex' />.</p>
<p>I much prefer the second of these, because I find it far more intuitive: an action of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is a way of thinking of the elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> as symmetries of some sort. It&#8217;s tempting to say that an action of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is a way of regarding <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> as a symmetry group, but that&#8217;s not quite correct, because we allow different elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to give us the same symmetry. For example, here&#8217;s an action of the permutation group <img src='http://s0.wp.com/latex.php?latex=S_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_n' title='S_n' class='latex' /> on the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2&#92;}' title='&#92;{1,2&#92;}' class='latex' />. There are two permutations of <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2&#92;}' title='&#92;{1,2&#92;}' class='latex' />, namely the identity and <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' />: map all even permutations to the identity and all odd permutations to <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' />. </p>
<p>What do we have to check in order to be sure that that is an action? If <img src='http://s0.wp.com/latex.php?latex=%5Crho&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;rho' title='&#92;rho' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> are elements of <img src='http://s0.wp.com/latex.php?latex=S_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_n' title='S_n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28%5Crho%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(&#92;rho)' title='&#92;phi(&#92;rho)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28%5Csigma%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(&#92;sigma)' title='&#92;phi(&#92;sigma)' class='latex' /> are the associated permutations of <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2&#92;}' title='&#92;{1,2&#92;}' class='latex' />, then we need <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28%5Crho%5Csigma%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(&#92;rho&#92;sigma)' title='&#92;phi(&#92;rho&#92;sigma)' class='latex' /> to equal <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28%5Crho%29%5Cphi%28%5Csigma%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(&#92;rho)&#92;phi(&#92;sigma)' title='&#92;phi(&#92;rho)&#92;phi(&#92;sigma)' class='latex' />. That is an easy consequence of facts about what you get when you multiply even and odd permutations together, which correspond closely to facts about what happens when you multiply the identity and <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> together. 
(For example, odd times odd is even, and <img src='http://s0.wp.com/latex.php?latex=%2812%29%2812%29%3D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(12)=' title='(12)(12)=' class='latex' />identity.)</p>
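<p>The homomorphism condition for this example can be verified exhaustively for a small value of n. The Python sketch below assumes n = 4, relabels the two-element set as {0, 1} so that permutations can be stored as tuples, and decides parity by counting cycles; the function names are illustrative choices, not standard ones.</p>

```python
from itertools import permutations

# Assumption: n = 4, and permutations are tuples with p[i] the image of i.
n = 4
S_n = list(permutations(range(n)))

def compose(g, h):
    """Return the permutation 'apply h first, then g'."""
    return tuple(g[h[i]] for i in range(len(h)))

def is_even(p):
    """A permutation is even when (length - number of cycles) is even."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return (len(p) - cycles) % 2 == 0

IDENTITY = (0, 1)  # the identity permutation of the two-element set
SWAP = (1, 0)      # the transposition, written (12) in the text

def phi(p):
    """Send even permutations to the identity and odd ones to the swap."""
    return IDENTITY if is_even(p) else SWAP

# The homomorphism condition phi(rho sigma) = phi(rho) phi(sigma).
assert all(
    phi(compose(rho, sigma)) == compose(phi(rho), phi(sigma))
    for rho in S_n
    for sigma in S_n
)
print(sum(1 for p in S_n if phi(p) == IDENTITY))
```

<p>The assertion passes for all 576 pairs, and the script prints <code>12</code>, the number of even permutations of four points. Twelve different group elements therefore give the same symmetry of the two-element set, which is exactly why this action does not let us regard the group itself as a symmetry group.</p>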
<p>If <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a homomorphism from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S%28X%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S(X)' title='S(X)' class='latex' />, that means that for each <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)' title='&#92;phi(g)' class='latex' /> is a permutation of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. (Here I am doing something easy but important: making sure I am clear in my mind about what kinds of objects things are. It&#8217;s a very good habit to get into.) 
That means that I will find myself writing slightly odd expressions like <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)(x)' title='&#92;phi(g)(x)' class='latex' />: since <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)' title='&#92;phi(g)' class='latex' /> is a permutation of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, which is in turn a kind of function from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, I must be able to apply <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)' title='&#92;phi(g)' class='latex' /> to elements of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. </p>
<p>If one is careful, it can be nice to imagine that the elements of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> themselves are &#8220;doing the transformation&#8221; to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. That is, instead of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> turning <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> into a bijection, which in turn does things to elements of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, we allow ourselves (once we have carefully defined what the action is) to write expressions like <img src='http://s0.wp.com/latex.php?latex=g%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)' title='g(x)' class='latex' />, and simply understand that this is shorthand for <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)(x)' title='&#92;phi(g)(x)' class='latex' />. Then the main property that an action has to have is that <img src='http://s0.wp.com/latex.php?latex=%28gh%29%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(gh)(x)' title='(gh)(x)' class='latex' /> is the same as <img src='http://s0.wp.com/latex.php?latex=g%28h%28x%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(h(x))' title='g(h(x))' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=g%2Ch%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g,h&#92;in G' title='g,h&#92;in G' class='latex' /> and every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />. 
(The first of these means you multiply <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and then apply the transformation that corresponds to <img src='http://s0.wp.com/latex.php?latex=gh&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='gh' title='gh' class='latex' />, whereas the second means that you apply the transformation that corresponds to <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and then the transformation that corresponds to <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' />.)</p>
<p>There is one example that is a good one to have in your head, as it gives you a very good idea of what an action is. Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> be the alternating group <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' /> be a regular tetrahedron, and label its vertices 1, 2, 3 and 4. For each permutation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> in <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> and each position of <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' />, we can find a rotation that permutes the vertices of <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' /> according to <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' />. For example, to achieve the permutation <img src='http://s0.wp.com/latex.php?latex=%2812%29%2834%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(34)' title='(12)(34)' class='latex' /> we do a half turn about the line that joins the midpoints of the edges linking 1 to 2 and 3 to 4, and to achieve the permutation <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> we do a rotation through 120 degrees about the line that joins vertex 4 to the middle of the opposite face. 
(Note that these axes depend on the position of <img src='http://s0.wp.com/latex.php?latex=T&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T' title='T' class='latex' /> rather than being fixed. See the discussion in <a href="http://gowers.wordpress.com/2011/10/16/permutations/#comment-12397">the post on permutations</a>.)</p>
<p>The action just described is a <em>faithful action</em>, which means that different elements of the group correspond to different transformations. (More formally, the homomorphism from <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S%28X%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S(X)' title='S(X)' class='latex' /> is an injection.) However, we can also use this set-up to define a second action of <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be the set that consists of the three lines that join midpoints of opposite edges. (That is, <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is a set with three elements, each of which is a line.) Then any rotation of the tetrahedron will also permute these three lines, so <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> acts on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. This action is not faithful: for example, a half turn about one of the lines fixes all three lines (the other two rotate through 180 degrees but they still map to themselves). In a later post, we shall see that this gives us a very clear explanation of an important fact about the group <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' />. </p>
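<p>The failure of faithfulness can be seen by listing the group elements that fix all three lines. The Python sketch below rests on one modelling assumption: a line joining the midpoints of two opposite edges is recorded by that pair of edges, so that a rotation moves a line exactly as the corresponding vertex permutation moves its two edges. The names <code>line</code>, <code>act</code> and <code>fixing_all</code> are illustrative.</p>

```python
from itertools import permutations

# Assumptions: vertices are labelled 1 to 4 as in the text; a permutation
# is a dict sending each label to its image; and each line joining the
# midpoints of two opposite edges is recorded by that pair of edges.
labels = (1, 2, 3, 4)

def is_even(p):
    """Even permutations are those with (4 - number of cycles) even."""
    seen, cycles = set(), 0
    for start in labels:
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return (len(labels) - cycles) % 2 == 0

A4 = [
    p
    for p in (dict(zip(labels, image)) for image in permutations(labels))
    if is_even(p)
]

def line(a, b, c, d):
    """The line through the midpoints of edges ab and cd."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})

X = [line(1, 2, 3, 4), line(1, 3, 2, 4), line(1, 4, 2, 3)]

def act(p, ell):
    """Move a line by applying the vertex permutation p to its two edges."""
    return frozenset(frozenset(p[v] for v in edge) for edge in ell)

# Elements of A_4 that fix every line: the action is faithful only if the
# identity is the sole such element.
fixing_all = [p for p in A4 if all(act(p, ell) == ell for ell in X)]
print(len(A4), len(fixing_all))
```

<p>The script prints <code>12 4</code>: of the twelve rotations, four fix every line, namely the identity and the three half turns, so distinct group elements give the same permutation of the three lines.</p>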
<p><strong>Group presentations.</strong></p>
<p>I want to end this post by elaborating on what I meant by &#8220;defining a group in an abstract way&#8221;. You should by now have met the dihedral groups. The dihedral group <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> of order <img src='http://s0.wp.com/latex.php?latex=2n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2n' title='2n' class='latex' /> can be defined in two rather different ways. The first way is concrete: it is the group of symmetries of an <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-gon. By that I don&#8217;t really mean that its elements have to be symmetries of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-gons, but rather that any group that is isomorphic to the group of symmetries of the <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-gon counts as an instantiation of the abstract group <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' />.</p>
<p>Another way of defining groups is by using <em>generators and relations</em>. This is called giving a <em>presentation</em> of the group. In the case of the group <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> the usual presentation uses two generators, <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, say, and the relations <img src='http://s0.wp.com/latex.php?latex=a%5E2%3Db%5En%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^2=b^n=e' title='a^2=b^n=e' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=aba%3Db%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aba=b^{-1}' title='aba=b^{-1}' class='latex' />. It&#8217;s not hard to use these relations to reduce every product you can make out of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=a%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^{-1}' title='a^{-1}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^{-1}' title='b^{-1}' class='latex' /> to an element of the form <img src='http://s0.wp.com/latex.php?latex=b%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^j' title='b^j' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=ab%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^j' title='ab^j' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=0%5Cleq+j%5Cleq+n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0&#92;leq j&#92;leq n-1' title='0&#92;leq j&#92;leq n-1' class='latex' />. 
For example, if <img src='http://s0.wp.com/latex.php?latex=n%3D5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=5' title='n=5' class='latex' /> (so we are talking about the symmetry group of the pentagon), and I take the product <img src='http://s0.wp.com/latex.php?latex=ab%5E7a%5E%7B-3%7Db%5E%7B-2%7Dab%5E%7B-4%7Da%5E2b%5E%7B-2%7Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^7a^{-3}b^{-2}ab^{-4}a^2b^{-2}a' title='ab^7a^{-3}b^{-2}ab^{-4}a^2b^{-2}a' class='latex' />, then I can do a series of obvious simplifications as follows. First, since <img src='http://s0.wp.com/latex.php?latex=a%5E2%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^2=e' title='a^2=e' class='latex' />, I can change every power of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> to either <img src='http://s0.wp.com/latex.php?latex=e&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e' title='e' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />. If I do that, I get <img src='http://s0.wp.com/latex.php?latex=ab%5E7ab%5E%7B-2%7Dab%5E%7B-6%7Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^7ab^{-2}ab^{-6}a' title='ab^7ab^{-2}ab^{-6}a' class='latex' />. In a similar way, I can change all powers of <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> so that they are between 0 and 4. That gets me to <img src='http://s0.wp.com/latex.php?latex=ab%5E2ab%5E3ab%5E4a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^2ab^3ab^4a' title='ab^2ab^3ab^4a' class='latex' />. Thirdly, the relation <img src='http://s0.wp.com/latex.php?latex=aba%3Db%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aba=b^{-1}' title='aba=b^{-1}' class='latex' /> implies that <img src='http://s0.wp.com/latex.php?latex=ba%3Dab%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ba=ab^{-1}' title='ba=ab^{-1}' class='latex' />. 
That is, I can move an <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> from the right of a <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> to the left of that <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> if I change the <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> to a <img src='http://s0.wp.com/latex.php?latex=b%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^{-1}' title='b^{-1}' class='latex' />. From that it follows that I can do the same with powers of <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />. For example,<br />
<img src='http://s0.wp.com/latex.php?latex=b%5E3a%3Dbbba%3Dbbab%5E%7B-1%7D%3Dbab%5E%7B-1%7Db%5E%7B-1%7D%3Dab%5E%7B-1%7Db%5E%7B-1%7Db%5E%7B-1%7D%3Dab%5E%7B-3%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^3a=bbba=bbab^{-1}=bab^{-1}b^{-1}=ab^{-1}b^{-1}b^{-1}=ab^{-3}' title='b^3a=bbba=bbab^{-1}=bab^{-1}b^{-1}=ab^{-1}b^{-1}b^{-1}=ab^{-3}' class='latex' />.</p>
<p>Going back to the expression <img src='http://s0.wp.com/latex.php?latex=ab%5E2ab%5E3ab%5E4a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^2ab^3ab^4a' title='ab^2ab^3ab^4a' class='latex' />, I can use the above fact to change it to<br />
<img src='http://s0.wp.com/latex.php?latex=ab%5E2ab%5E3aab%5E%7B-4%7D%3Dab%5E2ab%5E3b%5E%7B-4%7D%3Dab%5E2ab%5E%7B-1%7D%3Dab%5E2ab%5E4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^2ab^3aab^{-4}=ab^2ab^3b^{-4}=ab^2ab^{-1}=ab^2ab^4' title='ab^2ab^3aab^{-4}=ab^2ab^3b^{-4}=ab^2ab^{-1}=ab^2ab^4' class='latex' />.<br />
Applying the little fact again, we get<br />
<img src='http://s0.wp.com/latex.php?latex=ab%5E2ab%5E4%3Daab%5E%7B-2%7Db%5E4%3Db%5E%7B-2%7Db%5E4%3Db%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^2ab^4=aab^{-2}b^4=b^{-2}b^4=b^2' title='ab^2ab^4=aab^{-2}b^4=b^{-2}b^4=b^2' class='latex' />.<br />
So we have a simple algorithm for putting all &#8220;words&#8221; (that is, expressions made out of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=a%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^{-1}' title='a^{-1}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^{-1}' title='b^{-1}' class='latex' />) into a <em>standard form</em>. With a bit more effort, one can show that no two expressions in standard form are equal. For example, if <img src='http://s0.wp.com/latex.php?latex=ab%5E3%3Db%5E4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^3=b^4' title='ab^3=b^4' class='latex' /> we would deduce that <img src='http://s0.wp.com/latex.php?latex=a%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=b' title='a=b' class='latex' /> (by multiplying both sides on the right by <img src='http://s0.wp.com/latex.php?latex=b%5E%7B-3%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^{-3}' title='b^{-3}' class='latex' />), which is false.</p>
<p>Actually, why is it false? How can we be sure that there isn&#8217;t some strange way of using the relations <img src='http://s0.wp.com/latex.php?latex=a%5E2%3Db%5E5%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^2=b^5=e' title='a^2=b^5=e' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=aba%3Db%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aba=b^{-1}' title='aba=b^{-1}' class='latex' /> to show that <img src='http://s0.wp.com/latex.php?latex=a%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=b' title='a=b' class='latex' />? One quick answer is that we can find a concrete group &#8212; the symmetry group of a pentagon &#8212; and two elements of that group &#8212; one reflection and one rotation &#8212; that satisfy the given relations. If those relations implied that <img src='http://s0.wp.com/latex.php?latex=a%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=b' title='a=b' class='latex' /> then they would enable us to deduce that a reflection of the pentagon was equal to a rotation of the pentagon, which is just plain false. </p>
<p>That same argument shows that <img src='http://s0.wp.com/latex.php?latex=D_%7B10%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{10}' title='D_{10}' class='latex' /> has at least 10 elements, and since there are only 10 distinct standard forms it also has at most 10, so it has exactly 10 elements. By using the standard-form algorithm we can build up the multiplication table. For example, <img src='http://s0.wp.com/latex.php?latex=ab%5E3ab%5E2%3Da%5E2b%5E%7B-3%7Db%5E2%3Db%5E4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^3ab^2=a^2b^{-3}b^2=b^4' title='ab^3ab^2=a^2b^{-3}b^2=b^4' class='latex' />, and so on.</p>
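<p>The standard-form algorithm is easy to mechanize. What follows is a minimal Python sketch rather than anything definitive: it assumes that the standard form a^s b^j is stored as the pair (s, j), and the names mul, reduce_word and show are made up for illustration. The only facts it uses are a^2 = b^n = e and ba = ab^{-1}.</p>

```python
from itertools import product

def mul(x, y, n):
    """Multiply a^s b^j by a^t b^k in D_{2n}; the result is in standard form."""
    s, j = x
    t, k = y
    # Moving an a leftwards past b^j turns it into b^{-j} (from ba = ab^{-1}).
    if t == 1:
        j = -j
    return ((s + t) % 2, (j + k) % n)

def reduce_word(word, n):
    """Reduce a word, given as (letter, exponent) pairs, to standard form."""
    result = (0, 0)  # the identity e
    for letter, exponent in word:
        if letter == "a":
            factor = (exponent % 2, 0)   # uses a^2 = e
        else:
            factor = (0, exponent % n)   # uses b^n = e
        result = mul(result, factor, n)
    return result

def show(x):
    """Print-friendly version of the pair (s, j)."""
    s, j = x
    return ("a" if s else "") + (f"b^{j}" if j else "") or "e"

# The word a b^7 a^{-3} b^{-2} a b^{-4} a^2 b^{-2} a from the pentagon example.
word = [("a", 1), ("b", 7), ("a", -3), ("b", -2), ("a", 1),
        ("b", -4), ("a", 2), ("b", -2), ("a", 1)]
print(show(reduce_word(word, 5)))      # b^2
print(show(mul((1, 3), (1, 2), 5)))    # ab^3 times ab^2 gives b^4

# The full multiplication table of D_10 as a dictionary.
elements = list(product((0, 1), range(5)))
table = {(x, y): mul(x, y, 5) for x in elements for y in elements}
```

<p>Run on the word from the pentagon example, the sketch returns b^2, in agreement with the calculation by hand, and the dictionary table holds all 100 entries of the multiplication table.</p>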
<p>So now we have two ways of thinking about <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' />. Either it is the group with generators <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> and relations <img src='http://s0.wp.com/latex.php?latex=a%5E2%3Db%5En%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^2=b^n=e' title='a^2=b^n=e' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=aba%3Db%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='aba=b^{-1}' title='aba=b^{-1}' class='latex' />, or it is the group of symmetries of an <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-gon.</p>
<p>The main point of what I want to say in this section is that there is a danger that you will become too keen on abstraction. Certain facts about <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> are <em>obvious</em> if you think of it as a symmetry group and quite a lot less obvious if you argue directly from a presentation. For example, <img src='http://s0.wp.com/latex.php?latex=D_%7B12%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{12}' title='D_{12}' class='latex' /> contains (a copy of) <img src='http://s0.wp.com/latex.php?latex=D_6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_6' title='D_6' class='latex' /> as a subgroup. One proof of that fact consists in arguing as follows. Let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be the generators of <img src='http://s0.wp.com/latex.php?latex=D_6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_6' title='D_6' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> be the generators of <img src='http://s0.wp.com/latex.php?latex=D_%7B12%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{12}' title='D_{12}' class='latex' />. 
Then the function that takes <img src='http://s0.wp.com/latex.php?latex=b%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b^j' title='b^j' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=d%5E%7B2j%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d^{2j}' title='d^{2j}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=ab%5Ej&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab^j' title='ab^j' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=cd%5E%7B2j%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='cd^{2j}' title='cd^{2j}' class='latex' /> is an isomorphism from <img src='http://s0.wp.com/latex.php?latex=D_6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_6' title='D_6' class='latex' /> to its image, which is a subgroup of <img src='http://s0.wp.com/latex.php?latex=D_%7B12%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{12}' title='D_{12}' class='latex' />. Checking this is slightly fiddly, though not too hard.</p>
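<p>For what it is worth, the fiddly check can be handed to a computer. Here is a hedged Python sketch, assuming that a^s b^j is stored as the pair (s, j) and taking the standard-form multiplication rule for granted; the function names are hypothetical. It checks by brute force that the map preserves products and that distinct elements have distinct images.</p>

```python
from itertools import product

def mul(x, y, n):
    """Product of a^s b^j and a^t b^k in D_{2n}, as a pair in standard form."""
    s, j = x
    t, k = y
    if t == 1:          # b^j a = a b^{-j}
        j = -j
    return ((s + t) % 2, (j + k) % n)

def phi(x):
    """The proposed map from D_6 to D_12: b^j goes to d^{2j}, ab^j to cd^{2j}."""
    s, j = x
    return (s, (2 * j) % 6)

d6 = list(product((0, 1), range(3)))

# phi respects multiplication for every pair of elements of D_6 ...
is_homomorphism = all(
    phi(mul(x, y, 3)) == mul(phi(x), phi(y), 6) for x in d6 for y in d6
)
# ... and distinct elements have distinct images.
is_injective = len({phi(x) for x in d6}) == len(d6)

print(is_homomorphism, is_injective)   # True True
```

<p>This confirms the claim, but notice that it gives no hint as to why the claim is true.</p>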
<p>How much more transparent, however, is the following argument. <img src='http://s0.wp.com/latex.php?latex=D_%7B12%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{12}' title='D_{12}' class='latex' /> is the group of symmetries of a regular hexagon. If you join alternate vertices of the hexagon you get an equilateral triangle. All the symmetries of that triangle are given in an obvious way by symmetries of the hexagon, so the symmetry group of the triangle is a subgroup of the symmetry group of the hexagon.</p>
<p>I shall have more to say about group actions, but for now I&#8217;ll content myself with this message (which I&#8217;ve said a few times, but let me say it once more).</p>
<p><em>Abstraction is great, but don&#8217;t get too carried away with it. In particular, if you know that a group is isomorphic to a group of symmetries, that gives you direct access to a lot of information about it. Don&#8217;t throw that information away (unless for some reason you like complicated fiddly proofs that don&#8217;t tell you why a result is true).</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/06/group-actions-i/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>A more modest proposal</title>
		<link>http://gowers.wordpress.com/2011/11/03/a-more-modest-proposal/</link>
		<comments>http://gowers.wordpress.com/2011/11/03/a-more-modest-proposal/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 17:47:54 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Mathematics on the internet]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3678</guid>
		<description><![CDATA[In my previous post I suggested a way in which an online system of submitting and commenting on papers might perhaps work better than our current system of journals, editors and anonymous referees. I am very grateful to all who commented, both positively and (more often) negatively. It has given me a lot to think [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3678&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In my previous post I suggested a way in which an online system of submitting and commenting on papers might perhaps work better than our current system of journals, editors and anonymous referees. I am very grateful to all who commented, both positively and (more often) negatively. It has given me a lot to think about. One thing that I wasn&#8217;t expecting, but should have expected, was that a number of people just plain don&#8217;t like the idea of an online alternative, regardless of the rational arguments. I don&#8217;t mean that there aren&#8217;t arguments to back up the dislike &#8212; merely, that I think that there is a dislike there, which becomes an argument in itself, since if many people have an emotional reaction against a new system, then that makes it less likely that the system will be adopted by enough people to become as officially recognised as the journal system. To avoid misunderstanding, let me stress that I&#8217;ve got nothing against emotional reactions, as long as they are backed up with arguments; and in the comments on my previous post they have been. Indeed, the arguments against various aspects of what I suggested have caused me to realize that there are some disadvantages I didn&#8217;t think of and others that I underestimated. </p>
<p>In this post, I want to summarize the points made in the comments (for the benefit of anyone who is interested in what was said but doesn&#8217;t have time to read through them all), and then make a second suggestion, which I think deals with a number of objections to the first. As with the first, I don&#8217;t see the details as set in stone. I think it&#8217;s an improvement on the first, but doubtless it can itself be improved on. Whether it reaches the level where one should actually consider trying to implement it is of course quite another matter. But I do think that these issues should be discussed: if we were designing a system from scratch for disseminating and evaluating mathematical output, I don&#8217;t think we would come up with the current journal system, though of course that&#8217;s not the situation, and historical accidents often result in quite good ways of doing things.<br />
<span id="more-3678"></span></p>
<hr />
<p><strong>Summary of the reaction to the previous post.</strong></p>
<p>I&#8217;ll number the reactions and attribute them, with links to the comments where they were expressed (in more detail). This isn&#8217;t a comprehensive list of objections &#8212; more like a list of the objections that have had an influence on the new suggestion. (Even then it may not be complete &#8212; apologies to anyone that I accidentally miss out.)</p>
<p>1. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12779">Andrew Stacey</a>. The incentive system I proposed (roughly speaking, Mathoverflow-type reputation points) will not be enough to make people contribute. Similar attempted sites have failed, and this may be because what people need as motivation is a direct and immediate benefit from contributing.</p>
<p>2. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12784">Andy P</a>. Even if a new system is demonstrably better for mathematicians, it still needs to be taken seriously by people in other subjects who have power over mathematicians (e.g. when handing out money). Everyone understands how peer-reviewed journals work, but that won&#8217;t be the case for some new website.</p>
<p>3. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12787">Alexander Woo</a>. We need a way of rapidly sifting out the vast majority of candidates for positions that may attract hundreds of applications. A website with detailed narrative descriptions of papers will make that an impossibly long process.</p>
<p>4. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12788">Henry Cohn</a>. Mathoverflow reputation points work because we know that they&#8217;re just a game. If they actually mattered, then abuses of the system and all manner of unpleasantness would be much more likely.</p>
<p>5. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12804">Yla Tausczik</a>. A useful aspect of the current system is that journals fix an official version of an article, which can then be the one that other articles refer to.</p>
<p>6. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12810">Scott Morrison</a>. To have a realistic chance of success, any proposal should be <em>incremental</em> rather than revolutionary.</p>
<p>7. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12814">David Savitt</a>. One of the most valuable aspects of the current system is the kind of nit-picking feedback that ends up improving the presentation of a paper. People would be unwilling to provide that except anonymously &#8212; if, that is, they could be bothered to provide it at all.</p>
<p>8. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12816">Noah Snyder</a>. So much is published that, whatever incentive systems one tries to provide, most of it would simply not be looked at.</p>
<p>9. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12827">Super Mario</a>. It is <em>very</em> hard to persuade a lot of people to use a website, and the success or failure of attempts is sometimes extremely sensitive to tiny details of how the site works.</p>
<p>10. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12840">Andy P</a>. The journal system works just fine, so trying to devise a new system is pointless, and potentially damaging.</p>
<p>11. <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12942">Shahab</a> also makes a number of interesting points, too long to summarize here.</p>
<hr />
<p><strong>The new suggestion.</strong></p>
<p>Imagine the following common situation. You&#8217;ve worked for some time on a problem, and finally you&#8217;ve proved something interesting enough to publish &#8212; or so it seems to you anyway. So you write a preprint. Are you happy with it? Yes, up to a point, but you have a few residual anxieties, such as whether you&#8217;ve really got all those technical lemmas correct, whether you&#8217;ve mentioned all the relevant previous work, whether your write-up is going to be comprehensible to anyone else, etc. etc. Wouldn&#8217;t it be nice to get some detailed feedback before you submit it for publication? But who is going to be prepared to put in the work it takes to check calculations, comment on presentation, and so on?</p>
<p>That&#8217;s where www.howsmypreprint.com comes in. (The suggestion for a name is of course not serious.) You put your preprint on the arXiv and you create a page on Howsmypreprint, wait a little while, and then a few weeks later (or perhaps much sooner if the system really works) you get a list of typos, small errors, big errors if there are any, suggestions for how to improve the presentation, and so on.</p>
<p>But why is anybody ready to do that for you? Here&#8217;s where a suggestion of Andrew Stacey comes in: if you want your paper checked over in this way, then you have to pay for that service by doing it for other people. In other words, you accrue points for working on other people&#8217;s papers, and you spend them when they do work on yours.</p>
<p>That&#8217;s the basic idea. There are many details to discuss, but first let me say why I think that in principle it can deal with almost all the objections above. The numbers here will correspond to the numbers up there.</p>
<p>1. There is now a genuine selfish incentive for contributing to the site. If it is seen to work, then people will be keen to use the service, and therefore keen to contribute.</p>
<p>2. The site is not intended to supplant the journal system. It is meant to provide a new service. </p>
<p>However, it could have a profound influence on the journal system. For instance, if I get detailed feedback on my preprint, I could then submit not just the paper but the feedback too. Then the work of the journal could be greatly reduced: all they need from the referee is an assessment of how interesting the paper is, and the difficult bit &#8212; reading it carefully and making lots of suggestions &#8212; has been done already. Some journals might start to insist that all their submissions must first have spent a certain period of time on the site.</p>
<p>3. Since journals still survive, we still have a rapid sifting mechanism.</p>
<p>4. The points on the site are no longer reputation points &#8212; they are &#8220;brownie points&#8221;. In case that&#8217;s just a UK expression, I mean that they are rewards for a service rather than an indication of how amazing you are. Also, since the purpose of the site is not to <em>evaluate</em> papers, there isn&#8217;t much reason to game the system. (The only one I can think of is trying to earn lots of points by giving rather rushed and incomplete feedback. I&#8217;ll discuss that potential problem later.)</p>
<p>5. Not a problem as journals still survive.</p>
<p>6. This system is incremental rather than revolutionary, since it is an addition to what we have now, which could gradually replace certain aspects of it (the main one being that the hard work done by referees would be done at a different stage of the process).</p>
<p>7. Not a problem &#8212; the feedback could be provided anonymously.</p>
<p>8. If the points system was properly calibrated (which might be a challenge) then something like Kirchhoff&#8217;s current law ought to apply: on average, if you contributed to the site, you would be rewarded for your contribution. To put it more crudely, all those authors writing uninteresting papers would be helping out with other uninteresting papers.</p>
<p>9. I can&#8217;t say that this proposal addresses the problem that it&#8217;s hard to predict what will work. </p>
<p>10. I think most of these objections don&#8217;t apply to this revised proposal.</p>
<p>So I&#8217;ve ended up saying that all the objections that a new proposal can reasonably be expected to deal with have been dealt with. However, that leaves another question: does this suggestion throw away so much that it ends up being pointless? If all that happens is that you sometimes get feedback on a paper before you submit it instead of getting it after you&#8217;ve submitted it, has anything significant changed?</p>
<p>I&#8217;ll discuss this at some length, but before I do, I&#8217;d remind you of Scott Morrison&#8217;s point, that change should be evolutionary. This is meant as a first step (which might be the only step we ever wanted to take), so part of the point of it is that it is <em>not</em> a big change. What I&#8217;d like to argue is that it&#8217;s a good small change, and that it could potentially lead to further evolutionary steps &#8212; by the gradual addition of extra features to the site.</p>
<p>But suppose that we just stick with the proposal above, which leaves the work of evaluating, certifying and archiving to journals but potentially takes from journals the more arduous task of reading carefully through submissions. Doesn&#8217;t that just leave everybody doing the same amount of work, with no further benefits?</p>
<p>I think not. One benefit of this system would be that the voluntary work we do by critically reading other people&#8217;s papers would be coupled more closely to the benefit we get from having our own papers critically read. At the moment, if we do a lazy job, or sit on a paper for a long time, or refuse to referee it because we are too busy, almost nobody will find out, so the negative consequences for us are close to zero. With the online system (which, remember, is first and foremost a supplement to the current system, which would only gradually come to replace certain aspects of it), we would be putting in the work in order to earn a reward. That would feel fair.</p>
<p>I also think that carefully reading a paper and making suggestions for improvements is a very different process from deciding whether it is good enough for a particular journal. This system could in principle decouple these two tasks. One person could do the careful reading and report via the website. Another, for the journal, could make a judgment on its suitability for publication. What&#8217;s more, the second person would be looking at a revised and improved paper, and would (if things work as I envisage them) have access to the report of the first person. So they would be making a judgment with more information available. I think this would make the job of refereeing papers for journals much less painful and much more streamlined. Something like it happens already when one is asked to give a quick judgment on whether it is worth refereeing a paper at all, but wouldn&#8217;t it be better to make those quick judgments <em>after</em> a paper has been through the &#8220;cleansing&#8221; process? And wouldn&#8217;t it be better for people who find it hard to get their results published if they could at least get some feedback?</p>
<p>Perhaps those would be fairly small gains &#8212; I&#8217;m not sure &#8212; but an online system would come with a lot of flexibility that the current journal system does not provide, which could potentially add considerably to those gains. For example, if one added back some of the features I suggested earlier, like the possibility of offering constructive comments on other people&#8217;s work, then the journal referee would have more information to go on, such as how other people in the area were reacting to the paper. Another gain that I&#8217;ve already mentioned is that it would be easy to allow different kinds of mathematical document to receive feedback, even if they were not intended for journal publication.</p>
<p><strong>Fine details.</strong></p>
<p>There is a potential problem with the points system, which is that it wouldn&#8217;t be right to reward people for giving just any feedback to papers: it has to be useful feedback, of a kind that demands quite a bit of time. It would be unfair if people were to receive detailed and helpful feedback in return for having offered feedback that merely mentioned one or two easily spotted typos here and there.</p>
<p>How can this problem be overcome? One idea is that when a report is offered on a paper, the author of that paper can say how satisfied they are with the report, on a scale from 1 to 5, say. So if your report says merely, &#8220;This looks OK to me, except that on page 2 line 5 you&#8217;ve written &#8220;the the&#8221;,&#8221; then you won&#8217;t get rewarded very much. I think it might be an idea to have a feature a bit like Mathoverflow&#8217;s &#8220;acceptance&#8221; of answers: if somebody does such a good job on your paper that it&#8217;s clear that there&#8217;s no need for anyone else to make a comprehensive list of detailed suggestions, then you &#8220;accept&#8221; their report &#8212; and they are duly rewarded. But the satisfaction mark could take into account how difficult you thought your paper was to work through in the first place. </p>
<p>Should these reports be public, and should they be anonymous? One possibility is that the writer of the report could decide whether he or she wanted to be named and was willing for the report to be made public. The author would also have a say in whether the report was public. If both referee and author were happy to have the report made public, then it would be. One could also have a private link to the report, which the author could make available to the journal to which he or she decides to submit the paper.</p>
<p>Another feature one might have is a sort of reverse acceptance, where the writer of a report would tick a box to confirm that a new draft of the author has dealt satisfactorily with the suggestions made. Again, this information could speed up the process of conventional publication considerably.</p>
<p>What if the author of the paper unfairly fails to recognise the hard work put in by somebody who writes a report? If the report is public, then the unfairness of the author would be there for all to see, but I don&#8217;t see an easy solution if it remains private. However, I think that only in rather difficult, exceptional cases would there be any reason for authors to behave in this way. Perhaps some people would be a little ungenerous, but if the referee had put in a lot of work, then surely the vast majority of authors would be happy to reward it appropriately.</p>
<p>A very simple additional feature that could be helpful is &#8220;certification buttons&#8221; that you press to give some useful information to other people. One might be, &#8220;This is a serious mathematical paper.&#8221; It wouldn&#8217;t say anything about whether the paper was correct, but just that it wasn&#8217;t the work of a crank. If you pressed that button, you would get a very small addition to your points, and it would be a matter of public record that you had pressed it. (The same would go for all certification buttons, to help people judge the value of the certifications.)</p>
<p>Another might be, &#8220;I haven&#8217;t checked in detail, but I&#8217;m confident that this proof is essentially correct.&#8221; Yet another, for which more points would be on offer, could be, &#8220;I have checked carefully and am happy to confirm that the proof is essentially correct.&#8221; (That wouldn&#8217;t be a guarantee that every last detail was correct, but just that the certifier, who would be named, was very confident in the results.)</p>
<p>What happens if cranks start certifying each other&#8217;s papers? There are many possible answers to this. One I like is due to Noam Nisan (or at least, I got it from <a href="http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comment-12831">this comment of his</a>), which is to set up &#8220;networks of trust&#8221;. If for any reason I decide that I trust the judgments of some reviewer, I click on a box that creates an edge between me and that reviewer (in a graph of which we are both vertices). Various algorithms can be used to derive information from the resulting graph about who to trust if you begin with some small group of people that you <em>definitely</em> trust. And official institutions could set up their own networks, possibly making them public. </p>
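<p>To make the idea of deriving trust from the graph a little more concrete, here is one very simple possibility, sketched in Python. It is not Noam Nisan&#8217;s algorithm, and every detail is an assumption made for illustration: edges are treated as directed (from the person who trusts to the person trusted), the function name trust_scores and the decay parameter are invented, and the score is just the decay factor raised to the number of steps from the nearest person you definitely trust.</p>

```python
from collections import deque

def trust_scores(edges, seeds, decay=0.5):
    """Score each reachable reviewer by decay ** (steps from the nearest seed)."""
    graph = {}
    for truster, trusted in edges:
        graph.setdefault(truster, []).append(trusted)

    scores = {seed: 1.0 for seed in seeds}   # people you definitely trust
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        for neighbour in graph.get(person, []):
            if neighbour not in scores:      # first visit is a shortest path
                scores[neighbour] = scores[person] * decay
                queue.append(neighbour)
    return scores

# Hypothetical data: two cranks who trust only each other are never reached.
edges = [("me", "alice"), ("alice", "bob"),
         ("crank1", "crank2"), ("crank2", "crank1")]
scores = trust_scores(edges, ["me"])
print(scores)   # {'me': 1.0, 'alice': 0.5, 'bob': 0.25}
```

<p>In this toy model, cranks who certify each other gain nothing unless somebody in the trusted part of the graph creates an edge to one of them. A real system would presumably want something more robust than a breadth-first search.</p>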
<p><strong>Summary.</strong></p>
<p>As I see it, the main properties of this second suggestion are these.</p>
<p>(i) It is designed to be a useful supplement to the journal system rather than a replacement for it.</p>
<p>(ii) It could streamline the work we do for journals.</p>
<p>(iii) The incentive for working on this site would be that others would do the same for you. (That&#8217;s sort of true for the current system, except that if you don&#8217;t do your share of the work, others still do it for you.)</p>
<p>(iv) If somebody didn&#8217;t want to have anything to do with the new system, that would be fine.</p>
<p>(v) It would be easy to add further features to the system, which could allow both it and the journal system to evolve. Here are three examples. First, if a simple certification system could tell us that people we trusted had judged a paper to be serious and almost certainly correct, we might well have, for many papers, all the information we needed for metrics, sifting out of job applications and the like. We might find we could get by with far fewer journals. Second, if people wanted to experiment with ideas like virtual journals, it would be easy for them to do so. Third, one could make it possible to give feedback in the form of smallish comments that were different in style from the detailed reports that would be the main purpose of the site, but also useful.</p>
<p>Before I stop, let me mention one other feature that I&#8217;d like to see, which I forgot to mention earlier. It&#8217;s that everyone would start with a credit of, say, three papers (maybe more if they were PhD students). That&#8217;s partly so that the system can get started at all, and partly because beginning mathematicians probably need to get a few papers under their belts before they start refereeing the work of other people. (That said, many graduate students work through recently published papers of more senior people, and could in principle offer extremely useful feedback. That wouldn&#8217;t be ruled out at all.)</p>
<p>Another thing I forgot to mention is that since points would be just for earning the right to have feedback on your submissions, there would be no need to make them public, and so no unhealthy competition.</p>
<p>Yet another thing I forgot to mention is Andy P&#8217;s view that the journal system ain&#8217;t broke so we shouldn&#8217;t fix it. Rather than comment on this, I refer you to <a href="http://agtb.wordpress.com/2011/11/02/the-problem-with-journals/">Noam Nisan&#8217;s elegantly written response</a> (to which Andy P in turn responds).</p>
<p>Added later: I make a further suggestion in <a href="http://gowers.wordpress.com/2011/11/03/a-more-modest-proposal/#comment-13036">this comment below</a>, which I think could significantly improve the chances of a site like this working.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/11/03/a-more-modest-proposal/feed/</wfw:commentRss>
		<slash:comments>86</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>How might we get to a new model of mathematical publishing?</title>
		<link>http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/</link>
		<comments>http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/#comments</comments>
		<pubDate>Mon, 31 Oct 2011 19:04:14 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Mathematics on the internet]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3645</guid>
		<description><![CDATA[This is a post I&#8217;ve been intending to write for several months, but now seems to be quite a good moment, since the issue is in the air somewhat. For example, I&#8217;ve just read a post by Michael Nielsen on a similar topic, which itself was responding to things that other people have written. However, [...]]]></description>
			<content:encoded><![CDATA[<p>This is a post I&#8217;ve been intending to write for several months, but now seems to be quite a good moment, since the issue is in the air somewhat. For example, I&#8217;ve just read <a href="http://michaelnielsen.org/blog/open-access-a-short-summary/">a post by Michael Nielsen</a> on a similar topic, which itself was responding to things that other people have written. However, he is addressing a different issue: that of the restriction of access to journal articles once they are published. I am more interested in whether mathematicians really need journal articles at all, now that we have the internet. (Just in case anyone hasn&#8217;t noticed, this post is not part of the series about first-year mathematics at Cambridge &#8230;)</p>
<p>Before I go any further, let me make clear right from the outset that I&#8217;m not merely saying, &#8220;We can stick our papers on the internet, so let&#8217;s forget about journals.&#8221; I think that journals still have a vital role to play, even though the internet exists. However, like many people, I do not think it is at all obvious that they will <em>continue</em> to have a vital role to play, so I&#8217;d like to discuss two questions.</p>
<p>1. If we didn&#8217;t have journals, then what might we have instead?</p>
<p>2. How could the change from journals to whatever replaces them actually take place?<br />
<span id="more-3645"></span></p>
<p>To answer the first question I&#8217;d like to propose a model that is similar to models that others have proposed. I know this not from being familiar with the models themselves, but from having read sentences like, &#8220;Several people have suggested models where &lt;description of type of model that I am about to propose&gt;.&#8221; If anyone can provide me with links to similar suggestions, I&#8217;ll be happy to insert them.</p>
<p>What I think could work is something like a cross between the arXiv, a social networking site, Amazon book reviews, and Mathoverflow. I&#8217;ll try to describe what it might look like, and then defend the suggestion against several objections that I think are likely to occur to people. (If you still have objections, I&#8217;ll be interested to know what they are. But I hope you will read this sympathetically, and for the sake of balance consider the disadvantages of the current journal system as well.)</p>
<p>In an ideal world, this new website would simply be an extension of the arXiv, but if that couldn&#8217;t be organized, then instead it would be a separate site with links to the arXiv. Then if you proved a result, you would post it on the arXiv as you do now, but you would also start a page on this website, which would be devoted to your paper. The page would have various subject classifications &#8212; possibly the same ones as on the arXiv.</p>
<p>Once your paper was up, anybody who wanted to would be able to respond to it. Posting a response would be a bit like posting an answer to a question on Mathoverflow, and as with Mathoverflow it would be possible to make comments, either on the paper itself or on people&#8217;s responses to it. Also as with Mathoverflow, it would be possible to vote up people&#8217;s responses, which would give them reputation points.</p>
<p>What would be in a typical response? It could be anything useful: obvious possibilities would be a description of the main results for the non-expert, an evaluation of the paper, an explanation of what the paper contributes to its area, an assessment of how big that &#8220;area&#8221; is, a list of minor errors, and so on.</p>
<p>In order to motivate what I shall say next, let me list some potential problems with the idea as I have described it so far. </p>
<p>1. Nobody would have any motivation to contribute to the website.</p>
<p>2. If referees were anonymous, then any &#8220;reputation&#8221; accrued on the site wouldn&#8217;t translate into genuine reputation.</p>
<p>3. But if referees were not anonymous, then they might be unwilling to write about the papers they understood best, for fear of offending people they knew.</p>
<p>4. People might not like to have their papers evaluated in public.</p>
<p>5. Some papers might not get evaluated at all.</p>
<p>6. Some papers by well-established people might be more favourably reviewed than equally good papers by less well-established people.</p>
<p>7. If somebody wrote an unjustly negative evaluation, it would be read by everybody who looked up the paper, thereby unfairly damaging the reputation of the author.</p>
<p>8. If someone gets a paper published in a good journal, that really means something. But if all they got was a few comments on a website, that wouldn&#8217;t be the same sort of stamp of quality or certificate of correctness.</p>
<p>Let me deal with these points in turn. </p>
<p><em>1. Nobody would have any motivation to contribute to the website.</em></p>
<p>One could say the same of Mathoverflow, and yet many people contribute high-quality answers to that. Why do they do so? I can think of at least three possible reasons: the questions are often quite interesting, people like accruing reputation points (however silly that might seem), and people who have a good answer to a question like the chance to display their erudition in public.</p>
<p>Could the same factors work for a paper-evaluation site? Well, some papers would certainly be interesting and a pleasure to comment on, but what about run-of-the-mill papers? I think even these could be of interest to people in the areas concerned, but I have to admit that for many papers, mere mathematical interest alone will not be a sufficient motivation for commenting on them.</p>
<p>Reputation points on the site would, I think, provide a significant extra motivation, which could apply even to evaluations of uninteresting papers. It would take a display of skill and judgment, not to mention work, to evaluate such papers well, and if this was rewarded with reputation points, one would feel appreciated for doing it. And the third motivation, the chance to show off one&#8217;s erudition, would be there too.</p>
<p>How does that compare with the current system? Well, I&#8217;d say that the main reason I referee papers is that I find it hard to say no. Sometimes this is because I get a good and interesting paper that relates in some way to my work, so I can see that I am a fairly obvious choice of referee and would feel guilty about refusing. But sometimes I get a paper that is not really about anything that I care about but is officially in my area, or an area that I have at some point worked in, and I accept for some much sillier reason, such as that I forgot about the email and when reminded feel too guilty to refuse.</p>
<p>So as the system is now, refereeing is basically a chore: we know we ought to do some of it, because we want our own papers to be refereed &#8212; or the whole system would break down. However, because refereeing is private and anonymous, we get almost no recognition for our efforts, so although there is a fairly strong motivation for agreeing to referee a paper (the wish not to feel guilty), there is very little motivation for doing more than a merely adequate job once one has agreed.</p>
<p><em>2. If referees were anonymous, then any &#8220;reputation&#8221; accrued on the site wouldn&#8217;t translate into genuine reputation.</em></p>
<p><em>3. But if referees were not anonymous, then they might be unwilling to write about the papers they understood best, for fear of offending people they knew.</em></p>
<p>I have a proposal for dealing with these twin problems. Referees would have the option of writing under their real names, or under a pseudonym, or even a bit of both. When you registered for the site, you would give your details and whatever names and pseudonyms you wanted to use. Then any reputation points you earned would be attached to all those names and pseudonyms. In particular, if anybody looked you up (by name) they could find out how much reputation you had accrued, and therefore get some idea of how much you were contributing to the mathematical community in this way.</p>
<p>An obvious objection is that it would become easy to guess who was writing under a certain pseudonym, because you could simply look at users in the relevant area until you found one with the same reputation. To deal with that, I would suggest not giving the exact number of reputation points next to pseudonyms. One method might be to have very broad categories such as &#8220;high&#8221;, &#8220;medium&#8221; and so on &#8212; the minimum information that would allow one to form a reasonable judgment about how seriously to take the referee. Another might be to add some randomized errors to the reputation points associated with the pseudonyms, just to disguise their relationship with actual names.</p>
<p>So what I am suggesting <em>would</em> allow reputation points to translate into genuine reputation, even if you wanted to write pseudonymously. I would add that in my opinion it would be better if people wrote under their own names except under quite unusual circumstances. If this was part of the ethos of the site, then people would be less inclined to upvote responses offered by people writing under a pseudonym. But if someone felt that they could say what they thought only under the cloak of pseudonymity, then that could at least be an option.</p>
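<p>The two ways of disguising a pseudonym&#8217;s reputation mentioned above could be sketched as follows. The thresholds, the &#8220;low&#8221; band, the size of the random error and the function names are all assumptions chosen for illustration. The noise is derived from the pseudonym itself so that it stays the same on every page view; otherwise an observer could average it away.</p>

```python
import random


def reputation_band(points, medium_from=100, high_from=1000):
    """Map an exact score to a broad band shown next to a pseudonym.

    The thresholds are illustrative assumptions, not values from the post.
    """
    if points >= high_from:
        return "high"
    if points >= medium_from:
        return "medium"
    return "low"


def noisy_reputation(points, pseudonym, relative_error=0.2):
    """Return the score perturbed by a random relative error.

    Seeding the generator with the pseudonym gives the same perturbation
    every time that pseudonym is displayed.
    """
    rng = random.Random(pseudonym)
    factor = 1 + rng.uniform(-relative_error, relative_error)
    return max(0, round(points * factor))
```

<p>For instance, <code>reputation_band(150)</code> gives <code>"medium"</code>, while <code>noisy_reputation(500, "anon42")</code> gives some fixed value between 400 and 600.</p>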
<p><em>4. People might not like to have their papers evaluated in public.</em></p>
<p>This is, I think, a much more complicated issue. Suppose you are a talented young mathematician, just beginning research. Suppose also that you lack confidence &#8212; you&#8217;re quite pleased with your results but you have no idea how good others will think they are. At present, you can get some private feedback by simply submitting to a journal and waiting for the referee&#8217;s report. With the system I&#8217;m suggesting, you would have your work dissected, and possibly criticized, in public. </p>
<p>I think this is quite a serious potential problem, but I have a suggestion for mitigating it, which is that moderators on the site should adopt a very strong policy of discouraging negative comments, and this too should be part of the ethos of the site. If you think that a result is actually incorrect, then of course you should politely point out the mistake. (Perhaps in that case, to spare people&#8217;s blushes, it could be possible to withdraw a paper from the site, even if it is stuck there on the arXiv.) But if you merely find it completely uninteresting, then you should indicate that in a &#8220;positive&#8221; way, as is done on MathSciNet when reviewers content themselves with a bald description of the results with no further comments. </p>
<p>In particular, unpleasant negative comments by pseudonymous reviewers would be completely contrary to the spirit of the site. An official policy of that kind would lead to their being downvoted, or even, in extreme cases, removed.</p>
<p><em>5. Some papers might not get evaluated at all.</em></p>
<p>What happens if somebody submits a worthy but dull paper in an unfashionable area? I&#8217;ve had to referee many such papers and my heart sinks every time, because I just don&#8217;t really know, and, worse still, don&#8217;t much care, how good they are. And yet people&#8217;s careers may depend on such papers being accepted by a reasonable journal. Under the new system, would anybody be prepared to look at them?</p>
<p>I have two suggestions for this. One is that the longer a paper has gone unresponded to, the more reputation one would earn for being ready to respond. This could either be done by increasing the reputation you get for each upvote, or it could be done by having badges for, say, getting a response upvoted five times to a paper that had been unresponded to for over a year. In the first case, the premium could carry on and on increasing, making it less and less likely that a paper would languish unread.</p>
<p>The second suggestion is that it would also be part of the way the site worked that a response to a paper could fall short of a full referee&#8217;s report. Indeed, I&#8217;ve already more or less said that. One of the difficult aspects of the current system is that one has to do a complete job before saying anything at all. With the online system, one could write a response such as, &#8220;Lemma 2.1 is already known: see such-and-such a paper,&#8221; or &#8220;The proof can be simplified if you use &#8230;&#8221; That would be useful information for the author and for other readers of the paper, and would take a lot less work than reading through and evaluating the whole thing. And that would make it much more likely that people would be prepared to respond to papers that weren&#8217;t earth-shattering.</p>
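<p>The first suggestion, a premium that keeps growing while a paper waits for a response, might be sketched like this. The base value, the bonus per month and the use of 30-day periods are illustrative assumptions.</p>

```python
from datetime import date


def points_per_upvote(posted, first_response, base=10, bonus_per_month=1):
    """Reputation earned per upvote on the first response to a paper.

    The premium grows linearly with the number of whole 30-day periods
    the paper waited and is never capped, so a neglected paper becomes
    steadily more rewarding to respond to.
    """
    months_waiting = (first_response - posted).days // 30
    return base + bonus_per_month * max(0, months_waiting)


# Hypothetical examples: a response after three days, and one after a year.
quick = points_per_upvote(date(2011, 10, 31), date(2011, 11, 3))
slow = points_per_upvote(date(2011, 1, 1), date(2012, 1, 1))
```

<p>With these assumed numbers the quick response earns the base 10 points per upvote and the response after a year earns 22.</p>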
<p><em>6. Some papers by well-established people might be more favourably reviewed than equally good papers by less well-established people.</em></p>
<p>Possibly true, but it&#8217;s equally true of the current system. Also, if you write a report praising a paper to the skies but not giving a good reason for your praise, then your report won&#8217;t (or shouldn&#8217;t) get those precious upvotes.</p>
<p>Incidentally, I wouldn&#8217;t rule out an iterative system for reputation, where upvotes from people with high reputation count for more than upvotes from people with low reputation. This might guard against &#8220;cartels&#8221; of people from some tiny subarea all being incredibly positive about each other&#8217;s papers.</p>
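<p>One possible form of such an iterative system, given here only as a sketch, borrows the idea behind PageRank: each upvote passes on a share of the voter&#8217;s current reputation, and the calculation is repeated until the numbers settle. The <code>damping</code> value, the number of rounds and the function name are assumptions, and a real site would need further safeguards.</p>

```python
def iterative_reputation(upvotes, rounds=20, damping=0.85):
    """Compute reputations in which upvotes from reputable users count for more.

    upvotes: list of (voter, author) pairs, one pair per upvote.
    Everyone starts with reputation 1.0. In each round a user keeps a small
    baseline (1 - damping) and receives, for every upvote, an equal share of
    the voter's reputation from the previous round.
    """
    users = sorted({user for pair in upvotes for user in pair})
    reputation = {user: 1.0 for user in users}
    votes_cast = {user: 0 for user in users}
    for voter, _ in upvotes:
        votes_cast[voter] += 1
    for _ in range(rounds):
        updated = {user: 1 - damping for user in users}
        for voter, author in upvotes:
            updated[author] += damping * reputation[voter] / votes_cast[voter]
        reputation = updated
    return reputation


# Hypothetical example: "p" is upvoted by a well-regarded expert,
# "q" by a newcomer whom nobody has upvoted.
example_votes = [
    ("a", "expert"), ("b", "expert"), ("c", "expert"),
    ("expert", "p"),
    ("newcomer", "q"),
]
example_reputation = iterative_reputation(example_votes)
```

<p>In the example, <code>p</code> ends up with more reputation than <code>q</code> although each received exactly one upvote. In this scheme a closed group that only upvotes its own members merely passes its starting reputation around, so it cannot inflate its total without support from outside.</p>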
<p><em>7. If somebody wrote an unjustly negative evaluation, it would be read by everybody who looked up the paper, thereby unfairly damaging the reputation of the author.</em></p>
<p>The author would have the right to respond to responses, and if they could explain convincingly why the negative evaluation (which in any case is discouraged) was wrong, then that evaluation would get plenty of downvotes and the author&#8217;s reputation would remain intact.</p>
<p><em>8. If someone gets a paper published in a good journal, that really means something. But if all they got was a few comments on a website, that wouldn&#8217;t be the same sort of stamp of quality or certificate of correctness.</em></p>
<p>Let&#8217;s think for a moment what getting published in a journal gives us now. To start with correctness, opinions differ about whether a referee is obliged to check carefully what an author writes. But the very fact that opinions differ is enough to tell us that the mere fact that a paper is accepted in a journal is <em>not</em>, on its own, a guarantee of correctness. At best it makes it a safer bet that the paper is correct. Most of the time, what makes one feel that an important result is &#8220;truly established&#8221; is, these days, not the fact that it eventually appears in a good journal, but the fact that it is read and understood by people in the field, who are likely to be able to sense if there is something fishy about the proof. If the result is sufficiently interesting, then that tends to happen long before the paper comes out.</p>
<p>With an online system of the kind I&#8217;m talking about, one could have more transparency about this process. Someone could write a response saying, &#8220;I haven&#8217;t checked this line by line, but the paper is based on a very nice idea, which I basically understand, and it&#8217;s quite clear that the details can be made to work.&#8221; That&#8217;s how I feel, for example, about some of Tom Sanders&#8217;s recent breakthroughs. Somebody else might go further and actually explain the idea. And someone else might say, &#8220;I&#8217;ve checked the first section carefully and apart from this minor mistake, which can easily be fixed, it&#8217;s correct.&#8221; And so on. (One would of course trust that kind of comment more if the person making it had plenty of reputation points. And if they vouched for the correctness of results that turned out to be wrong, then they would lose points.)</p>
<p>And how about the stamp of quality? Nowadays, if a paper is published in Annals or Inventiones, we think it must be pretty good, whereas if it is published in [insert name of your favourite not very good journal here], then we don&#8217;t. But is that all we do? It certainly gives us <em>some</em> indication of the quality of a paper, but it is a very imperfect measure. Nobody would say that every paper that appears in Annals is better than every paper that appears in, say, the Journal of the LMS, even if Annals is supposedly the better journal. So at best the stamp of quality is a rather crude measure.</p>
<p>Why do we need this measure? One of the main reasons is that we find ourselves having to judge other mathematicians, notably when hiring them. We get their CVs and publication lists, and we have a look through the latter to see what kinds of journals they are getting their papers published in. But if they are lucky and get an indifferent paper accepted by a very good journal, it will be very difficult for anyone but an expert to know this &#8212; and the unfortunate fact is that we often have to judge people who work in areas we don&#8217;t know much about. We also have only a rather vague idea of the relative quality of many journals. The one document that could really help us, the referee&#8217;s report, is strictly private.</p>
<p>Under the system I&#8217;m describing, we would have something potentially far more useful: not just the information of where on some very fuzzy quality scale the paper appears, but <em>actual descriptions</em> of what the paper has accomplished, how it fits into the general aims of the area, whether the techniques are standard, what is truly new, whether the paper is exciting and unexpected, and so on. If you were contemplating hiring somebody, you&#8217;d have a ready-made reference, written by a wide variety of people, before you even began.</p>
<p>I wondered whether it would be a good idea to have scores for papers. One could for instance vouch for its probable correctness by ticking a box, and perhaps give it marks on a scale of 1 to 5 for quality. (There would be guidelines: for instance, 5 might be, &#8220;a breakthrough result worthy of one of the top two or three journals&#8221; and 1 might be, &#8220;of interest to specialists in a tiny area only&#8221;.) Perhaps one could allow a digit after the decimal point too. There would be some pressure to give the right judgments, since if you didn&#8217;t then you&#8217;d risk getting downvoted. And the paper could get some kind of average score based not just on how many people gave it what score, but also on their reputations and the votes given to their responses.</p>
<p>I think something like that could be made to work, but I also wonder whether a more radical approach is simply to do away with this linear-scale idea (which I think is implicit in the current system) and satisfy ourselves with the opinions given in the responses. It would be a bit like the difference between an innocent/guilty type verdict and a <a href="http://en.wikipedia.org/wiki/Narrative_verdict">narrative verdict</a>. Instead of reducing the paper to a number, you&#8217;d have judgments and descriptions that would tell you all about the paper and give you a much more detailed idea of how good it was.</p>
<p>I&#8217;m torn on this point. I&#8217;d prefer not to have numbers, but it might in the end be necessary for the purposes of pleasing bureaucrats in other subjects who need what in this country are called &#8220;metrics&#8221;. (Roughly translated, that means numerical measures of quality.)</p>
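<p>If numbers did turn out to be necessary, the average score described above, which takes account of reviewers&#8217; reputations and of the votes on their responses, might be computed along the following lines. The particular weighting (reputation multiplied by one plus the net votes, never below zero) is an assumption made for the sake of the sketch.</p>

```python
def paper_score(ratings):
    """Reputation-weighted average of scores on the 1-to-5 scale.

    ratings: list of (score, reviewer_reputation, net_votes) tuples, where
    net_votes is upvotes minus downvotes on that reviewer's response.
    A response that has been voted down to -1 or below carries no weight.
    Returns None when no rating carries any weight.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for score, reputation, net_votes in ratings:
        weight = reputation * max(0, 1 + net_votes)
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0:
        return None
    # One digit after the decimal point, as suggested in the text.
    return round(weighted_sum / total_weight, 1)
```

<p>For example, <code>paper_score([(4, 50, 0), (2, 50, 0)])</code> gives 3.0, while a top score from a reputable, upvoted reviewer outweighs a low score whose response has been voted down.</p>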
<hr />
<p>After that discussion, let me collect together what I see as the main features of this hypothetical paper-evaluation site.</p>
<p>(1) You post your papers on the arXiv and create pages for them on the site.</p>
<p>(2) People can respond to your papers. Responses can range from smallish comments to detailed descriptions and evaluations (the latter being quite similar to referees&#8217; reports as they are now).</p>
<p>(3) Responses can be written under your own name or under a pseudonym.</p>
<p>(4) You can accrue reputation as a result of responses of either kind, but your pseudonym will have the reputation disguised enough to maintain your anonymity. </p>
<p>(5) Negative language is strongly discouraged. If a paper is uninteresting, it simply doesn&#8217;t attract much interest. If it is incorrect, one says so politely. </p>
<p>(6) There is a reputation premium for evaluating papers that have spent a long time not evaluated. (There would be a way of finding these: for instance, you could list all the unreviewed papers in a certain area or subarea in chronological order.)</p>
<p>(7) If you are not registered for the site, or if you are registered but have very few reputation points, then people know that you are not doing your bit for the mathematical community when it comes to the important task of evaluating the output of others. Conversely, if you have a high reputation, then people know that you are pulling your weight.</p>
<hr />
<p>I can think of a couple more potential problems. One is that cranks could put their papers on this site and perhaps gain some kind of bogus respectability &#8212; more so than they can at the moment when their papers don&#8217;t get published. I&#8217;m not too worried about that, however. In fact, I think this system would be worse for cranks than what we have at the moment, where they can put their papers on the arXiv. Under this system, they could put their papers on the site, but those papers would be followed by responses like, &#8220;I&#8217;ve checked, and as expected the supposed simple proof of Fermat&#8217;s Last Theorem is wrong. In particular, the statement of Lemma 4 is false.&#8221;</p>
<p>Another is that somebody might put a paper on the site, then register under a different name and give their own paper a very positive response. I think it ought to be possible to stop that by having some kind of barrier to registration as a reviewer. If any professional mathematician were to try it and get caught, then their (real life) reputation would take a huge knock, as people who have been caught praising their own books on Amazon have discovered.</p>
<hr />
<p>Now let me turn to the question of how one might actually convert from the current system to an online system such as the above. (By the way, I don&#8217;t insist on every last detail of the above &#8212; I&#8217;m sure it has its flaws, and I&#8217;m sure there are some good ideas that I haven&#8217;t thought of for features that the site could have.) My answer to this is simple. Somebody gets the website up and running, and people start submitting their papers to it, as well as submitting them for publication in the normal way. Then if the new method works in the way I think it should, it will gradually be realized that the extra information one gets from the fact that a paper has been accepted in a journal is close to zero, and the journal system will wither away. If it doesn&#8217;t work, then it&#8217;s the site that withers away: it goes down in history as yet another online initiative that failed to take off. </p>
<p>In theory, it would be possible to use the site to form something like &#8220;virtual journals&#8221;. If you wanted to set up a virtual journal in additive combinatorics, say, then people could ask you to give an official stamp of approval to their paper. The editor would find a referee and instruct them to write a report. If the editor felt that the report was strong enough, then they would post the report under the journal&#8217;s name, stating clearly that it had been accepted as up to the standards required of the Journal of Additive Combinatorics. These &#8220;journals&#8221; could then accrue reputation and be useful collections of papers.</p>
<p>Having said that, the current business model for journals would no longer work, as it would no longer be possible to make money from publishing mathematical articles. While many people would rejoice at the thought of Elsevier not making money out of their papers, there would also be less welcome side-effects. For example, the London Mathematical Society makes a lot of money from its journals, to the benefit of British mathematics. I should probably mark that as another disadvantage of an online system, but if such a system would be better in all other respects, then I think a more sensible response would be to try to come up with new ways of funding learned societies. For example, the money that universities currently spend on keeping their libraries stocked could be given in part to those societies instead.</p>
<p>One final difficulty is that the site would probably be complicated and big enough to need active maintenance, so it would cost money to run. But it could in principle do such a great service to the mathematical community that obtaining the funding should be possible if a proper proposal, perhaps including a beta version of the site itself, could be presented.</p>
<hr />
<p>Postscript: I&#8217;ve just Googled &#8220;The future of mathematical publishing&#8221; and come up with some interesting links. Here is <a href="http://www.istl.org/98-fall/article2.html">what Rob Kirby had to say in 1998</a>. Last February there was a conference at MSRI on mathematical publishing, which produced <a href="http://www.msri.org/attachments/workshops/587/MSRIfinalreport.pdf">this interesting document</a>. (One thing that interests me about the document is that the prevailing attitude seems to be one of thinking about how the mathematics journal will survive in the new and changed world we now inhabit, rather than thinking about what a system for evaluating research would look like if one were to design it from scratch.) And here&#8217;s one more link: a paper called <a href="http://www.fi.muni.cz/usr/sojka/download/dml2008/20.pdf">Some thoughts on the near-future digital mathematical library</a>, by Thierry Bouche.</p>
<p>Also, I strongly recommend <a href="http://michaelnielsen.org/blog/is-scientific-publishing-about-to-be-disrupted/">this post by Michael Nielsen</a>, which has influenced my thinking on these matters.</p>
<hr />
<p>I forgot to mention one other aspect of an online system that would in my opinion be a huge advantage. It&#8217;s that there are many kinds of mathematical communication that do not get published in journals because they are &#8220;the wrong kind of thing&#8221;. For example, if you write an essay about why certain promising approaches to a problem don&#8217;t in fact work, then unless you can express it precisely in the form of a mathematical theorem (a famous example of this is Razborov and Rudich&#8217;s Natural Proofs paper), you won&#8217;t get it published. Similarly, if you rewrite an existing proof in a much clearer way, you won&#8217;t get that published either, even if what you&#8217;ve done is far more useful to other mathematicians than yet another obscure theorem. On this site, you could submit mathematical documents of all kinds, perhaps giving them labels such as &#8220;new result&#8221; or &#8220;exposition of existing result&#8221; or &#8220;discussion of open problem&#8221;. If these were good enough, you would get glowing responses to them. Perhaps this could lead to a cultural shift towards giving more value to mathematical activities other than proving theorems.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-model-of-mathematical-publishing/feed/</wfw:commentRss>
		<slash:comments>118</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Equivalence relations</title>
		<link>http://gowers.wordpress.com/2011/10/30/equivalence-relations/</link>
		<comments>http://gowers.wordpress.com/2011/10/30/equivalence-relations/#comments</comments>
		<pubDate>Sun, 30 Oct 2011 10:40:40 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[General concepts]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3623</guid>
		<description><![CDATA[Equivalence relations are in a way a fairly simple mathematical concept. After all, it&#8217;s not that hard to learn what reflexive, symmetric and transitive mean and to remember that if you&#8217;ve got all three properties then you&#8217;ve got an equivalence relation. However, equivalence relations do still cause one or two difficulties. One, which I&#8217;ll discuss [...]]]></description>
			<content:encoded><![CDATA[<p>Equivalence relations are in a way a fairly simple mathematical concept. After all, it&#8217;s not that hard to learn what reflexive, symmetric and transitive mean and to remember that if you&#8217;ve got all three properties then you&#8217;ve got an equivalence relation. However, equivalence relations do still cause one or two difficulties. One, which I&#8217;ll discuss first, is that many people find that the proof that equivalence classes always form a partition is rather complicated and hard to remember. The other is that equivalence relations are closely connected to the notion of quotients, which appear in many places in mathematics, and which many people find quite hard to grasp.<br />
<span id="more-3623"></span></p>
<p><strong>Proving that equivalence classes form a partition.</strong></p>
<p>I&#8217;d like to try to demonstrate that this longish argument is one that does not need to be memorized, as long as you are in the habit of applying a couple of standard proof-generating techniques that I have already mentioned in these posts. But first I should say more precisely what it is we are trying to prove.</p>
<p><strong>Proposition.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> be a set and let <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> be an equivalence relation on <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. Then the equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</em></p>
<p><strong>Finding a proof.</strong> </p>
<p>1. I shall assume that you can remember the formal definition of an equivalence relation. If you can&#8217;t, then look it up in your notes or <a href="http://en.wikipedia.org/wiki/Equivalence_relation">on Wikipedia</a>.</p>
<p>2. I&#8217;m not going to assume that you know the formal definition of partition. That&#8217;s because, if people I&#8217;ve taught over the years are anything to go by, there is a good chance that you have a reasonable mental picture of what it is, but are a little less sure of how to turn that picture into a formal definition. </p>
<p>What, then, is a partition of a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />? Informally, I want to say that it is a collection of disjoint subsets of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> that &#8220;cover the whole of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />&#8221;. But how do we write that out properly? </p>
<p>The first thing we should do is give a name to the collection of sets. Since it&#8217;s a partition we are defining, the letter p seems appropriate. But also, what we&#8217;ve got here is a set of sets (part of the reason the concept is ever so slightly tricky), and it is nice to have a notation that distinguishes it from elements (usually denoted by lower-case letters) and sets (usually denoted by upper-case letters). So I&#8217;m going to use a curly p, looking like this: <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' />.</p>
<p>Note that <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> is a set whose elements are sets. To distinguish between set-of-sets kinds of sets and set-of-elements kinds of sets I myself like to call <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> a <em>collection</em> of sets rather than a set of sets. </p>
<p>How do we say in precise language that the sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> are disjoint? One&#8217;s first thought might be to say, &#8220;Any two sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> are disjoint.&#8221; However, there is a little piece of necessary pedantry that you have to add to that formulation. It should be, &#8220;Any two <em>distinct</em> sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> are disjoint.&#8221; But surely, you might protest, if one says &#8220;Any two sets&#8221; then one is talking about <em>two</em> sets, so of course they are distinct. I think the reason we insist so strongly on this point is that if you decide to write the statement out even more formally, it will look something like this:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cforall+X%2CY%5Cin%5Cmathcal%7BP%7D%5C+%5C+X%5Cne+Y%5Cimplies+X%5Ccap+Y%3D%5Cemptyset&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall X,Y&#92;in&#92;mathcal{P}&#92; &#92; X&#92;ne Y&#92;implies X&#92;cap Y=&#92;emptyset' title='&#92;forall X,Y&#92;in&#92;mathcal{P}&#92; &#92; X&#92;ne Y&#92;implies X&#92;cap Y=&#92;emptyset' class='latex' /></p>
<p>If we write it like that but omitting the condition that <img src='http://s0.wp.com/latex.php?latex=X%5Cne+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X&#92;ne Y' title='X&#92;ne Y' class='latex' />, then we end up with an incorrect definition, because there really is nothing to stop <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> equalling <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' />.</p>
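<p>To see concretely what goes wrong, here is a minimal Python sketch on a small finite example. The variable names are purely illustrative and are not part of the definition above; the sketch simply evaluates the formal statement with and without the condition that the two sets are distinct.</p>

```python
# Illustrative collection of sets; the name `collection` is an assumption of this sketch.
collection = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5})}

# Correct condition: for all X, Y in the collection, X != Y implies X and Y are disjoint.
with_distinctness = all(
    X.isdisjoint(Y) for X in collection for Y in collection if X != Y
)

# Same condition with "X != Y" omitted: now each X is also compared with itself,
# and a non-empty set always shares elements with itself.
without_distinctness = all(
    X.isdisjoint(Y) for X in collection for Y in collection
)

print(with_distinctness)     # True
print(without_distinctness)  # False
```

<p>The second check fails even though the three sets are obviously pairwise disjoint in the intended sense, which is exactly why the word &#8220;distinct&#8221; is needed.</p>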
<p>How do we say precisely that the sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> &#8220;cover&#8221; <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />? Well, that says that every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> belongs to at least one set in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' />. In symbols, it says,</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cforall+a%5Cin+A%5C+%5Cexists+X%5Cin%5Cmathcal%7BP%7D%5C+%5C+a%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall a&#92;in A&#92; &#92;exists X&#92;in&#92;mathcal{P}&#92; &#92; a&#92;in X' title='&#92;forall a&#92;in A&#92; &#92;exists X&#92;in&#92;mathcal{P}&#92; &#92; a&#92;in X' class='latex' />.</p>
<p>Of course, in combination with the disjointness property, that tells us that every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> belongs to <em>exactly</em> one set in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' />, which in fact gives us a concise definition of what it means for <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> to be a partition. </p>
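<p>For finite sets, the two conditions can be checked mechanically. The following is a rough Python sketch under that finiteness assumption (a partition in general need not be finite); the function name <code>is_partition</code> and the example sets are hypothetical choices made for illustration only.</p>

```python
from itertools import combinations


def is_partition(collection, A):
    """Check whether `collection` (a set of frozensets) is a partition of the finite set A."""
    # Every member of the collection should be a subset of A.
    subsets_of_A = all(X.issubset(A) for X in collection)
    # Disjointness: distinct sets have empty intersection. Because `collection`
    # is itself a set, combinations() only ever pairs up two different members.
    disjoint = all(X.isdisjoint(Y) for X, Y in combinations(collection, 2))
    # Covering: every element of A belongs to at least one set in the collection.
    covering = all(any(a in X for X in collection) for a in A)
    return subsets_of_A and disjoint and covering


A = {1, 2, 3, 4, 5}
good = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5})}
overlapping = {frozenset({1, 2}), frozenset({2, 3}), frozenset({4, 5})}
missing = {frozenset({1, 2}), frozenset({4, 5})}

print(is_partition(good, A))         # True
print(is_partition(overlapping, A))  # False: 2 lies in two different sets
print(is_partition(missing, A))      # False: 3 lies in no set
```

<p>The two failing examples break the &#8220;at most one&#8221; and &#8220;at least one&#8221; halves of the definition respectively.</p>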
<p>By this time you might be reminded of the definitions of injection and surjection: we need every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to be in <em>at most one</em> set in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> (the disjointness condition) and <em>at least one</em> set in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> (the covering condition). Is this just an amusing echo or is there a closer relationship? As always in mathematics, the answer is the latter: one can find a general framework that incorporates both. But I&#8217;ll leave that to you to think about if you feel like it.</p>
<p>Something else you might be wondering is why I went to the bother of writing <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BP%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{P}' title='&#92;mathcal{P}' class='latex' /> and worrying about collections of sets. Why didn&#8217;t I just call the sets in the partition <img src='http://s0.wp.com/latex.php?latex=X_1%2CX_2%2C%5Cdots%2CX_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X_1,X_2,&#92;dots,X_n' title='X_1,X_2,&#92;dots,X_n' class='latex' />? The answer is that if I had done that I would have inadvertently been making the assumption that there are only finitely many sets in the partition, and that doesn&#8217;t have to be the case. OK, you might counter, what about calling them <img src='http://s0.wp.com/latex.php?latex=X_1%2CX_2%2CX_3%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X_1,X_2,X_3,&#92;dots' title='X_1,X_2,X_3,&#92;dots' class='latex' /> in the infinite case? Well, not only does that force me to deal with two cases, which is ugly, but it also assumes that there are only countably many sets in the partition (a statement you won&#8217;t understand until you have covered countability in Numbers and Sets), which again needn&#8217;t be the case. So the curly-P notation, though not the only way of doing things, was chosen for pretty good reasons.</p>
<p>3. Right, we&#8217;ve been given a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and an equivalence relation <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> on <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and we want to prove that the equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. So we&#8217;ve got a bunch of sets, the equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />, and we must prove that it has the properties required of a partition. </p>
<p>I&#8217;m going to do a step that after a small amount of practice one can do without. I shall start the proof as follows.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' /> be the collection of equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />.</li>
<p>All I&#8217;ve done here is give a name to the collection of sets that we hope to prove is a partition of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</p>
<p>4. Now we can be very mechanical. We must prove that for every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> there is some <img src='http://s0.wp.com/latex.php?latex=E%5Cin%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E&#92;in&#92;mathcal{E}' title='E&#92;in&#92;mathcal{E}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E' title='a&#92;in E' class='latex' />, and we must prove that if <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> are distinct sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F' title='F' class='latex' /> are disjoint.</p>
<p>5. Next comes a very important principle: if you&#8217;ve defined a set <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> via some property <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> and you have the statement <img src='http://s0.wp.com/latex.php?latex=x%5Cin+S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in S' title='x&#92;in S' class='latex' />, then it is usually much better to rewrite this statement as <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />. That&#8217;s not very clearly expressed, but here&#8217;s an example. What does it mean to say that <img src='http://s0.wp.com/latex.php?latex=E%5Cin%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E&#92;in&#92;mathcal{E}' title='E&#92;in&#92;mathcal{E}' class='latex' />? Well, <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' /> is the set of all subsets of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> that are equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />. So the statement <img src='http://s0.wp.com/latex.php?latex=E%5Cin%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E&#92;in&#92;mathcal{E}' title='E&#92;in&#92;mathcal{E}' class='latex' /> can be rephrased as &#8220;<img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is an equivalence class of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />&#8221;. 
That&#8217;s what I meant earlier by saying that the step of defining <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' /> wasn&#8217;t really necessary: we can get away with simply talking about equivalence classes. I call this the getting-rid-of-sets trick. Here, we apply it to get rid of <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' />.</p>
<p>6. What, by the way, is an equivalence class? Well, the equivalence class of an element <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is defined to be <img src='http://s0.wp.com/latex.php?latex=%5C%7Bb%5Cin+A%3Aa%5Csim+b%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{b&#92;in A:a&#92;sim b&#92;}' title='&#92;{b&#92;in A:a&#92;sim b&#92;}' class='latex' />, which I shall denote by <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' />. That is, it is the set of all elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> that are related to <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />. </p>
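<p>As a concrete illustration of this definition, here is a short Python sketch for a finite set. The helper <code>equivalence_class</code> and the example relation <code>same_remainder_mod_3</code> are hypothetical names introduced only for this example.</p>

```python
def equivalence_class(a, A, related):
    """Return E(a), the set of all b in A with a ~ b, where related(a, b) means a ~ b."""
    return frozenset(b for b in A if related(a, b))


def same_remainder_mod_3(x, y):
    # Example relation (an assumption of this sketch): x ~ y when x and y
    # leave the same remainder on division by 3.
    return x % 3 == y % 3


A = set(range(10))
print(sorted(equivalence_class(1, A, same_remainder_mod_3)))  # [1, 4, 7]
```

<p>Note that the set comprehension mirrors the set-builder notation exactly: membership of the class is decided by the property, not by any list stored in advance.</p>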
<p>Using this, let&#8217;s go back to the first of the two statements we wanted to prove. It was this.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> there is some <img src='http://s0.wp.com/latex.php?latex=E%5Cin%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E&#92;in&#92;mathcal{E}' title='E&#92;in&#92;mathcal{E}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E' title='a&#92;in E' class='latex' />.</li>
<p>We can begin by translating this in a way that gets rid of <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' />.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> there is some equivalence class <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E' title='a&#92;in E' class='latex' />.</li>
<p>Next, we can do a further translation that takes account of what equivalence classes are.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> there is some element <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E(b)' title='a&#92;in E(b)' class='latex' />.</li>
<p>(In words, for every <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> there is a <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> belongs to the equivalence class of <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />.)</p>
<p>What could we choose as our <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />? Here we are in a position a bit like that of a chess player who has only one move that gets out of check. We know absolutely nothing about elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> except that <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is one of them. So we&#8217;d better try <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> itself. Is it true that <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E(a)' title='a&#92;in E(a)' class='latex' />? Well, what does that mean? </p>
<p>The getting-rid-of-sets principle applies again. <img src='http://s0.wp.com/latex.php?latex=E%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(x)' title='E(x)' class='latex' /> is defined to be <img src='http://s0.wp.com/latex.php?latex=%5C%7By%5Cin+A%3Ax%5Csim+y%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{y&#92;in A:x&#92;sim y&#92;}' title='&#92;{y&#92;in A:x&#92;sim y&#92;}' class='latex' />. So the statement <img src='http://s0.wp.com/latex.php?latex=u%5Cin+E%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u&#92;in E(x)' title='u&#92;in E(x)' class='latex' /> is equivalent to the statement <img src='http://s0.wp.com/latex.php?latex=x%5Csim+u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;sim u' title='x&#92;sim u' class='latex' />. Applying that with <img src='http://s0.wp.com/latex.php?latex=u%3Dx%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u=x=a' title='u=x=a' class='latex' /> we find that the statement <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E(a)' title='a&#92;in E(a)' class='latex' /> is equivalent to the statement <img src='http://s0.wp.com/latex.php?latex=a%5Csim+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim a' title='a&#92;sim a' class='latex' />. And that is true since <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is reflexive.</p>
<p>I took a long time over explaining that, but that is merely because I&#8217;m trying to spell out every minute detail of the thought process. What one would actually write for the entire proof so far is this.</p>
<li>We must first prove that every element <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> belongs to some equivalence class. But <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E(a)' title='a&#92;in E(a)' class='latex' />, since <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is reflexive.</li>
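<p>A finite sanity check of this covering step can be sketched in Python as follows. It is only a check on one example, not a substitute for the proof, and the helper names and the choice of relation are assumptions made for illustration.</p>

```python
def equivalence_class(a, A, related):
    """Return E(a), the set of all b in A with a ~ b."""
    return frozenset(b for b in A if related(a, b))


def same_remainder_mod_3(x, y):
    # Example equivalence relation, chosen only for illustration.
    return x % 3 == y % 3


A = set(range(10))

# Reflexivity (a ~ a) is what puts each a into its own class E(a).
is_reflexive = all(same_remainder_mod_3(a, a) for a in A)
each_a_in_own_class = all(
    a in equivalence_class(a, A, same_remainder_mod_3) for a in A
)

print(is_reflexive)         # True
print(each_a_in_own_class)  # True
```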
<p>7. We now come to the harder part of the proof, which is to show that distinct sets in <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' />, in other words, distinct equivalence classes, are disjoint.</p>
<p>Again, we want to remember that an equivalence class is an equivalence class <em>of something</em>. So we could formulate our task as that of showing the following statement: for every pair of elements <img src='http://s0.wp.com/latex.php?latex=a%2Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a,b' title='a,b' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> are distinct sets, then <img src='http://s0.wp.com/latex.php?latex=E%28a%29%5Ccap+E%28b%29%3D%5Cemptyset&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)&#92;cap E(b)=&#92;emptyset' title='E(a)&#92;cap E(b)=&#92;emptyset' class='latex' />. </p>
<p>I&#8217;ve noted that this is somewhat similar to proving that a function is an injection, which suggests that it might be nicer to prove the contrapositive. Another sign is that as things stand at the moment we are trying to deduce a &#8220;negative&#8221; statement from a &#8220;negative&#8221; statement: that if <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> are not the same, then they do not have any elements in common. It feels cleaner somehow to try to prove that if <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> <em>do</em> have an element in common, then they must be the same set.</p>
<p>It&#8217;s just possible that you have been accidentally confusing equivalence classes with <em>names</em> of equivalence classes, and therefore thinking that <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> are distinct equivalence classes if and only if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are distinct elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. But that is very much not the case, or equivalence relations wouldn&#8217;t be very interesting. Let us briefly remind ourselves what it means to say that <img src='http://s0.wp.com/latex.php?latex=E%28a%29%3DE%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)=E(b)' title='E(a)=E(b)' class='latex' />. </p>
<p>Isn&#8217;t that just obvious? Doesn&#8217;t it just mean that they are the same set? Well, yes, but <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> are not defined in the same way, so how are we supposed to <em>prove</em> that they are the same set? The answer is to use the <em>definition of set equality</em>. (This is a very important general principle, and not just one that applies to this particular proof.)</p>
<li>Two sets <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> are equal if every element of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> and every element of <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.</li>
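<p>The definition of set equality, and the fact that differently named classes can be the same set, can both be seen in a small Python sketch. The function <code>sets_equal</code> spells out the two inclusions rather than using Python&#8217;s built-in equality; all names and the example relation are illustrative assumptions.</p>

```python
def sets_equal(X, Y):
    """X equals Y when every element of X is in Y and every element of Y is in X."""
    every_x_in_Y = all(x in Y for x in X)
    every_y_in_X = all(y in X for y in Y)
    return every_x_in_Y and every_y_in_X


def equivalence_class(a, A, related):
    """Return E(a), the set of all b in A with a ~ b."""
    return frozenset(b for b in A if related(a, b))


def same_remainder_mod_3(x, y):
    # Example equivalence relation, chosen only for illustration.
    return x % 3 == y % 3


A = set(range(10))
E1 = equivalence_class(1, A, same_remainder_mod_3)
E4 = equivalence_class(4, A, same_remainder_mod_3)
E2 = equivalence_class(2, A, same_remainder_mod_3)

print(sets_equal(E1, E4))  # True: 1 and 4 are different names for the same class
print(sets_equal(E1, E2))  # False: these two classes are distinct
```

<p>Here 1 and 4 are distinct elements, yet their equivalence classes coincide, which is the distinction between a class and its name.</p>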
<p>8. Having decided all that, where are we? Our aim is to prove that for every pair of elements <img src='http://s0.wp.com/latex.php?latex=a%2Cb%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a,b&#92;in A' title='a,b&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> have an element in common, then <img src='http://s0.wp.com/latex.php?latex=E%28a%29%3DE%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)=E(b)' title='E(a)=E(b)' class='latex' />. So let&#8217;s start writing a bit more of the proof. First, I&#8217;ll set up the assumptions we get to use, using the &#8220;let&#8221; and &#8220;suppose&#8221; tricks, and also giving a name to the element that <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> have in common. Note that all these moves are, or at any rate should be, pure reflexes.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=c%5Cin+E%28a%29%5Ccap+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&#92;in E(a)&#92;cap E(b)' title='c&#92;in E(a)&#92;cap E(b)' class='latex' />.</li>
<p>9. Our aim now is to show that <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> are the same set. So we must show that every element of <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> and that every element of <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' />. By symmetry it&#8217;s enough to do just one direction. (More precisely, once we&#8217;ve done the first direction, we can see that just switching round the letters <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> will do the other direction, so we don&#8217;t bother.) Since we&#8217;re now trying to prove a statement that begins with the word &#8220;every&#8221;, we apply the &#8220;let&#8221; trick and write this.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' />.</li>
<p>10. Next, we apply the getting-rid-of-sets trick that I&#8217;ve mentioned. We have two statements that say that certain elements belong to certain sets defined by properties. They are</p>
<li><img src='http://s0.wp.com/latex.php?latex=c%5Cin+E%28a%29%5Ccap+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&#92;in E(a)&#92;cap E(b)' title='c&#92;in E(a)&#92;cap E(b)' class='latex' /></li>
<p>and </p>
<li><img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' />.</li>
<p>By the definitions of <img src='http://s0.wp.com/latex.php?latex=E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)' title='E(a)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(b)' title='E(b)' class='latex' /> these statements allow us to write the following addition to our proof.</p>
<li>Then <img src='http://s0.wp.com/latex.php?latex=a%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim c' title='a&#92;sim c' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />.</li>
<p>11. We shouldn&#8217;t at this stage forget what our aim is. We&#8217;re assuming that <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' /> and want to prove that <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(b)' title='d&#92;in E(b)' class='latex' />. That&#8217;s equivalent to proving that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />. So there it is, we&#8217;re allowed to assume that <img src='http://s0.wp.com/latex.php?latex=a%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim c' title='a&#92;sim c' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />, and we need to prove that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />. And we&#8217;re given that <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is reflexive, symmetric, and transitive.</p>
<p>12. At this point the proof becomes quite easy. The symmetry tells us that we can always replace <img src='http://s0.wp.com/latex.php?latex=x%5Csim+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;sim y' title='x&#92;sim y' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=y%5Csim+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;sim x' title='y&#92;sim x' class='latex' /> and the transitivity tells us that we&#8217;re trying to find a chain from <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> of &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> links&#8221;. The only statement we&#8217;ve got about <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> is that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' />, so that&#8217;s how our chain will have to start. Then we use the fact that <img src='http://s0.wp.com/latex.php?latex=a%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim c' title='a&#92;sim c' class='latex' />, which we flip round so that it says <img src='http://s0.wp.com/latex.php?latex=c%5Csim+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&#92;sim a' title='c&#92;sim a' class='latex' />. And finally, we know that <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />. Putting those three facts together and using transitivity, we get that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />.</p>
<p>13. Strictly speaking I had to use transitivity <em>twice</em> there. For example, from <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=c%5Csim+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&#92;sim a' title='c&#92;sim a' class='latex' /> I can deduce that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim a' title='b&#92;sim a' class='latex' />, and from that and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' /> I can deduce that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />. However, once one gets used to transitivity, one just knows that if <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> is a transitive relation and <img src='http://s0.wp.com/latex.php?latex=x_0Rx_1%2C+x_1Rx_2%2C%5Cdots%2Cx_%7Bk-1%7DRx_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_0Rx_1, x_1Rx_2,&#92;dots,x_{k-1}Rx_k' title='x_0Rx_1, x_1Rx_2,&#92;dots,x_{k-1}Rx_k' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x_0Rx_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_0Rx_k' title='x_0Rx_k' class='latex' />. (As an easy exercise, you might like to give a formal inductive proof of this statement.)</p>
<p>14. In case it&#8217;s not clear, the proof is finished. To the write-up I should add the following lines.</p>
<li>It follows that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />, and hence that <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(b)' title='d&#92;in E(b)' class='latex' />. Similarly, if <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(b)' title='d&#92;in E(b)' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' />.</li>
<p>Let&#8217;s collect together all the lines of proof and see what it looks like. I&#8217;ll add a little bit of padding too.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' /> be the collection of equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />. We must first prove that every element <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> belongs to some element of <img src='http://s0.wp.com/latex.php?latex=%5Cmathcal%7BE%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathcal{E}' title='&#92;mathcal{E}' class='latex' /> &#8212; that is, to some equivalence class. But <img src='http://s0.wp.com/latex.php?latex=a%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in E(a)' title='a&#92;in E(a)' class='latex' />, since <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is reflexive.
<p>Now let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=c%5Cin+E%28a%29%5Ccap+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&#92;in E(a)&#92;cap E(b)' title='c&#92;in E(a)&#92;cap E(b)' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=a%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim c' title='a&#92;sim c' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />. It follows by symmetry that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c%2C+c%5Csim+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c, c&#92;sim a' title='b&#92;sim c, c&#92;sim a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />. Hence, by transitivity, <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(b)' title='d&#92;in E(b)' class='latex' />. 
Similarly, if <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(b)' title='d&#92;in E(b)' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=d%5Cin+E%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&#92;in E(a)' title='d&#92;in E(a)' class='latex' />.</p>
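<p>Hence <img src='http://s0.wp.com/latex.php?latex=E%28a%29%3DE%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E(a)=E(b)' title='E(a)=E(b)' class='latex' />: any two equivalence classes that have an element in common are equal, so distinct equivalence classes are disjoint. This completes the proof that the equivalence classes of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> form a partition of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</p></li>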
<p>There&#8217;s one thing I want to make very clear now. It may look as though you have to remember rather a lot of tricks for this proof, and in a sense that&#8217;s true, but <em>they are general tricks</em>. They aren&#8217;t things you have to remember for this specific proof, but rather they are things like the &#8220;let&#8221; and &#8220;suppose&#8221; tricks, the getting-rid-of-sets trick, the give-it-a-name trick, and the &#8220;by symmetry&#8221; trick. I suppose I could also add the don&#8217;t-forget-to-use-the-precise-definition trick (or else an alternative definition, if a suitable one is available). Learning those tricks is a one-off investment that pays dividends throughout mathematics, and not just for specific proofs.</p>
<p>Once you&#8217;ve got those general tricks sorted out, and have carefully learnt the definitions of &#8220;equivalence relation&#8221; and &#8220;partition&#8221;, just about the only bit of actual mathematics you have to do is seeing how to deduce that <img src='http://s0.wp.com/latex.php?latex=b%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim d' title='b&#92;sim d' class='latex' /> from the fact that <img src='http://s0.wp.com/latex.php?latex=a%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim c' title='a&#92;sim c' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=b%5Csim+c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;sim c' title='b&#92;sim c' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=a%5Csim+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;sim d' title='a&#92;sim d' class='latex' />, using the information that <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is symmetric and transitive. And that, I hope you&#8217;ll agree, is a pretty easy problem.</p>
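<p>The statement just proved can also be checked on small finite examples. What follows is only a minimal sketch in Python, not part of the proof: the names equivalence_class and forms_partition, and the two sample relations, are assumptions made purely for illustration. It builds each class as the set of all x with a ~ x, then tests the two facts established above: every element lies in some class, and two classes with an element in common are equal.</p>

```python
def equivalence_class(a, A, related):
    """Return E(a), the set of all x in A with a ~ x."""
    return frozenset(x for x in A if related(a, x))


def forms_partition(A, related):
    """Check the two facts proved above for the classes E(a), a in A."""
    classes = {equivalence_class(a, A, related) for a in A}
    # Every element of A belongs to some class.
    covered = all(any(x in E for E in classes) for x in A)
    # Two classes with an element in common are equal.
    no_overlap = all(E == F or E.isdisjoint(F) for E in classes for F in classes)
    return covered and no_overlap


words = ["a", "bb", "cc", "d", "eee"]
same_length = lambda x, y: len(x) == len(y)   # an equivalence relation
close = lambda x, y: abs(x - y) in (0, 1)     # reflexive and symmetric, not transitive

print(forms_partition(words, same_length))    # True
print(forms_partition(range(4), close))       # False
```

<p>The second relation is reflexive and symmetric but not transitive, and its classes overlap without being equal. That is consistent with the proof, where transitivity was exactly what linked b to d.</p>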
<hr />
<p><strong>What is the point of equivalence relations?</strong></p>
<p>Now that that&#8217;s sorted out, let me turn to another question: why are equivalence relations important? I&#8217;ll give two answers to this, but I&#8217;m sure there are many more.</p>
<p>The first answer is that we often find ourselves not interested in certain distinctions. For example, if I am doing geometry, then I might want to talk about an equilateral triangle of side length 1. If someone were to say to me, &#8220;There are uncountably many equilateral triangles of side length 1. Which one are you talking about?&#8221; I would respond by saying that it doesn&#8217;t matter: nothing that I&#8217;m about to say depends on which equilateral triangle of side length 1 I&#8217;m going to choose.</p>
<p>If one tries to make precise this idea of &#8220;not mattering&#8221;, then one soon finds oneself talking about the relation of congruence. We often regard two shapes as &#8220;essentially the same&#8221; if they are congruent to one another. And congruence is an equivalence relation.</p>
<p>Sometimes, we don&#8217;t really care about size either &#8212; just shape. There&#8217;s an equivalence relation for that too, namely similarity. So we might declare that we regard two shapes as &#8220;essentially the same&#8221; if they are similar to one another (in the geometrical sense that you can get from one to the other by translating, reflecting, rotating and rescaling).</p>
<p>In modular arithmetic, we fix some positive integer <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and decide that we want to regard two integers as &#8220;essentially the same&#8221; if they differ by a multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. Again, an equivalence relation underlies this: that of congruence mod <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. (Of course, this is a different meaning of the word &#8220;congruent&#8221;.)</p>
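<p>As a rough illustration of congruence mod m, here is a small Python sketch. The function name congruent, the choice m = 5 and the sample range are all assumptions made for the example. It checks on a finite sample that &#8220;differ by a multiple of m&#8221; is reflexive, symmetric and transitive, and then groups the sample into classes by remainder.</p>

```python
def congruent(x, y, m):
    """True if x and y differ by a multiple of m."""
    return (x - y) % m == 0


m = 5
sample = range(-10, 11)

# Reflexive, symmetric and transitive on the sample.
reflexive = all(congruent(x, x, m) for x in sample)
symmetric = all(congruent(y, x, m) for x in sample for y in sample if congruent(x, y, m))
transitive = all(
    congruent(x, z, m)
    for x in sample for y in sample for z in sample
    if congruent(x, y, m) and congruent(y, z, m)
)

# Group the sample by remainder: one class per value of x % m.
classes = {}
for x in sample:
    classes.setdefault(x % m, []).append(x)
```

<p>A check on a finite sample is of course not a proof, but it shows what the definition says in a concrete case: the sample falls into exactly m classes, one for each possible remainder.</p>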
<p>Another situation where we like to regard things as being &#8220;essentially the same&#8221; even when they are different is abstract algebra. For example, we often like to regard two groups as essentially the same if they are isomorphic. I&#8217;d like to say that &#8220;is isomorphic to&#8221; is an equivalence relation on the set of all groups, but, dammit, I&#8217;m not allowed to. Why not? Because there are so many groups that they don&#8217;t form a set. (A silly way of seeing this is to observe that if <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> is any set, then I can define a one-element group by taking the set <img src='http://s0.wp.com/latex.php?latex=%5C%7BX%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{X&#92;}' title='&#92;{X&#92;}' class='latex' /> and defining <img src='http://s0.wp.com/latex.php?latex=X%5Ccirc+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X&#92;circ X' title='X&#92;circ X' class='latex' /> to be the only thing it can be, namely <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.)</p>
<p>Here&#8217;s a good example of the <em>dis</em>advantage of insisting that the official definition of everything has to be in terms of sets. It&#8217;s clear that every group is isomorphic to itself, that if <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />, and that if <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is isomorphic to <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' />. So &#8220;is isomorphic to&#8221; ought to be an equivalence relation, and it&#8217;s only a combination of an annoying set-theoretic paradox and an insistence that relations are &#8220;really&#8221; sets that stops it being one. If we defined relations more linguistically as something like &#8220;statements with two gaps into which you insert objects of the appropriate type&#8221;, then &#8220;is isomorphic to&#8221; would be an equivalence relation. </p>
<p><strong>Quotients.</strong></p>
<p>A second important justification for equivalence relations is that they give us a very useful way of building new mathematical structures out of old ones. The basic strategy is this. First you take some structure <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. Then you define a carefully chosen equivalence relation <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. Finally, you show that you can use the structure on <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to define a useful structure on the set of all the equivalence classes, which is customarily denoted by <img src='http://s0.wp.com/latex.php?latex=X%2F%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X/&#92;sim' title='X/&#92;sim' class='latex' /> and called the <em>quotient set</em>.</p>
<p>What has this use of the word &#8220;quotient&#8221; got to do with the use of the word when we are talking about division of numbers? Well, let&#8217;s suppose we are explaining division to a child. We might ask something like this. &#8220;David is sharing out 20 sweets amongst five children. How many sweets do they get each?&#8221; Assuming that David doesn&#8217;t want a riot on his hands, they will all get the same number, and that number is four. But let&#8217;s imagine the situation in more detail. What will David do with those sweets? He will divide them up into five piles, each with four sweets in it. That is, <em>he will partition those sweets</em>! We can even define an equivalence relation, namely, &#8220;is destined to be eaten by the same child as&#8221;. </p>
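<p>The sweets example can be written out as a tiny Python sketch. The sharing rule used here (dealing the sweets out one at a time, so that sweet number s goes to child s mod 5) is a hypothetical choice made only for illustration; any fair rule would do. Two sweets are equivalent when the function eater sends them to the same child, and the piles are the equivalence classes.</p>

```python
from collections import defaultdict

sweets = range(20)
children = 5


def eater(sweet):
    """Hypothetical sharing rule: deal the sweets out one at a time."""
    return sweet % children


# Two sweets are related if they are destined for the same child,
# so each pile is one equivalence class.
piles = defaultdict(list)
for sweet in sweets:
    piles[eater(sweet)].append(sweet)

pile_sizes = [len(pile) for pile in piles.values()]
print(len(piles), pile_sizes)
```

<p>There are five piles, one per child, and each pile contains four sweets, which is the answer to the division problem.</p>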
<p>I&#8217;m not trying to suggest here that the mathematical notion of a quotient is just as easy as the notion of dividing up some sweets into little piles. It isn&#8217;t. The reason it isn&#8217;t is the thing I talked about just a moment ago: we take quotients not just of sets but of <em>sets with structure</em>, and we like our quotients to inherit some of that structure. </p>
<p>Since this post is already quite long, and since quotients are quite a big topic, I think I&#8217;ll stop here for now and leave this short discussion as a cliffhanger.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/30/equivalence-relations/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Alternative definitions</title>
		<link>http://gowers.wordpress.com/2011/10/25/alternative-definitions/</link>
		<comments>http://gowers.wordpress.com/2011/10/25/alternative-definitions/#comments</comments>
		<pubDate>Mon, 24 Oct 2011 23:07:46 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[General concepts]]></category>
		<category><![CDATA[IA Groups]]></category>
		<category><![CDATA[IA Numbers and Sets]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3590</guid>
		<description><![CDATA[Something that happens very often in lecture courses is that you are presented with a definition, and soon after it you are told that a certain property is equivalent to that definition. This equivalence means that in principle one could have chosen the property as the &#8220;definition&#8221; and the definition as an equivalent property. To [...]]]></description>
			<content:encoded><![CDATA[<p>Something that happens very often in lecture courses is that you are presented with a definition, and soon after it you are told that a certain property is equivalent to that definition. This equivalence means that in principle one could have chosen the property as the &#8220;definition&#8221; and the definition as an equivalent property. To put that differently, suppose you are developing a piece of theory and have some word you want to define. To pick an imaginary example, suppose you have a notion of a set being &#8220;abundant&#8221;. Suppose that a set is defined to be abundant if it has property P, and that property P is equivalent to property Q. There may well not be much to choose between the following pair of alternatives. On the one hand you can say, &#8220;Definition: A set is <em>abundant</em> if it has property P,&#8221; and follow that with, &#8220;Proposition: A set is abundant if and only if it has property Q,&#8221; while on the other you can say, &#8220;Definition: A set is <em>abundant</em> if it has property Q,&#8221; and follow that with, &#8220;Proposition: A set is abundant if and only if it has property P.&#8221;<br />
<span id="more-3590"></span></p>
<p>That is a simple observation, and one that you would have been likely to make for yourself, or to have had pointed out to you at some stage. But it has a very important practical consequence, which I can sum up as a slogan.</p>
<li><em>You should treat equivalent properties as alternative definitions.</em></li>
<p>Rather than say in detail what I mean, I am going to discuss several examples. Some of them you will meet in Group Theory and Numbers and Sets. The others I will discuss in less detail, since you will not meet them until later. But they make good examples &#8212; perhaps you will find it useful to reread this post at some point in the future.</p>
<p><strong>1. Bijections.</strong></p>
<p>I have already touched on this example. A bijection is defined to be a function that is both an injection and a surjection. It may therefore seem obvious that if you are asked to prove that some function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a bijection, then you should set about proving first that it is an injection and then that it is a surjection. However, that is not always true. One of the basic results about bijections is this.</p>
<p><strong>Proposition.</strong> <em>A function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is a bijection if and only if it has an inverse.</em></p>
<p>Once we&#8217;ve proved that proposition, we are allowed to treat &#8220;has an inverse&#8221; as an alternative definition of &#8220;is a bijection&#8221;. Let me give a very simple instance of a proof where &#8220;has an inverse&#8221; is a more convenient definition to use. Suppose you are given two sets <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and asked to prove the easy fact that the function <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AA%5Ctimes+B%5Cto+B%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:A&#92;times B&#92;to B&#92;times A' title='&#92;phi:A&#92;times B&#92;to B&#92;times A' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)' title='(a,b)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%28b%2Ca%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(b,a)' title='(b,a)' class='latex' /> is a bijection. If you work directly from the injection-surjection definition, your proof will look something like this.</p>
<p><strong>Proof 1.</strong> First let us show that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is an injection. Suppose that <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)' title='(a,b)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28a%27%2Cb%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a&#039;,b&#039;)' title='(a&#039;,b&#039;)' class='latex' /> have the same image under <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />, that is, that <img src='http://s0.wp.com/latex.php?latex=%28b%2Ca%29%3D%28b%27%2Ca%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(b,a)=(b&#039;,a&#039;)' title='(b,a)=(b&#039;,a&#039;)' class='latex' />. Then (by the main property of ordered pairs) it follows that <img src='http://s0.wp.com/latex.php?latex=b%3Db%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b=b&#039;' title='b=b&#039;' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a%3Da%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=a&#039;' title='a=a&#039;' class='latex' />, and hence that <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29%3D%28a%27%2Cb%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)=(a&#039;,b&#039;)' title='(a,b)=(a&#039;,b&#039;)' class='latex' /> (again by the main property of ordered pairs). </p>
<p>Now let us show that <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a surjection. Let <img src='http://s0.wp.com/latex.php?latex=%28b%2Ca%29%5Cin+B%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(b,a)&#92;in B&#92;times A' title='(b,a)&#92;in B&#92;times A' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28a%2Cb%29%3D%28b%2Ca%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(a,b)=(b,a)' title='&#92;phi(a,b)=(b,a)' class='latex' />. </p>
<p>Since <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is an injection and a surjection, it follows that it is a bijection, as was wanted. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>Contrast that with the following proof.</p>
<p><strong>Proof 2.</strong> Define <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%3AB%5Ctimes+A%5Cto+A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi:B&#92;times A&#92;to A&#92;times B' title='&#92;psi:B&#92;times A&#92;to A&#92;times B' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5Cpsi%28b%2Ca%29%3D%28a%2Cb%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi(b,a)=(a,b)' title='&#92;psi(b,a)=(a,b)' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=%5Cpsi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;psi' title='&#92;psi' class='latex' /> is an inverse for <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' /> is a bijection. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
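<p>For small finite sets, the claim in Proof 2 that one map is an inverse for the other can be checked directly. The following Python sketch is only an illustration under assumed data: the particular sets A and B are made up, and phi and psi are written as ordinary functions on pairs. It verifies that applying one map and then the other returns every pair to itself, in both orders.</p>

```python
from itertools import product

A = {1, 2, 3}
B = {"x", "y"}


def phi(pair):
    """Send (a, b) in A x B to (b, a) in B x A."""
    a, b = pair
    return (b, a)


def psi(pair):
    """Send (b, a) in B x A to (a, b) in A x B."""
    b, a = pair
    return (a, b)


# psi undoes phi on A x B, and phi undoes psi on B x A.
left_inverse = all(psi(phi(p)) == p for p in product(A, B))
right_inverse = all(phi(psi(q)) == q for q in product(B, A))
print(left_inverse and right_inverse)   # True
```

<p>Both compositions have to be checked: a map with only a one-sided inverse need not be a bijection.</p>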
<p>Moral: if you need to prove that a function is a bijection, consider whether you can just write down an inverse.</p>
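<p>To see this moral in a concrete case, here is a minimal sketch in Python. The two small sets and the function names <code>phi</code> and <code>psi</code> are illustrative choices rather than part of the argument, and checking finite sets is of course no substitute for the proof.</p>

```python
from itertools import product

# Two small finite sets, chosen only for illustration.
A = {1, 2, 3}
B = {"x", "y"}

def phi(pair):
    """The map from A x B to B x A that swaps the coordinates."""
    a, b = pair
    return (b, a)

def psi(pair):
    """The candidate inverse, from B x A to A x B."""
    b, a = pair
    return (a, b)

AxB = set(product(A, B))
BxA = set(product(B, A))

# psi undoes phi on all of A x B, and phi undoes psi on all of B x A.
left_inverse = all(psi(phi(p)) == p for p in AxB)
right_inverse = all(phi(psi(q)) == q for q in BxA)
print(left_inverse, right_inverse)
```

<p>Both checks are needed: a map with only a one-sided inverse need not be a bijection.</p>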
<p>More general moral: if you are dealing with bijections, bear in mind that &#8220;has an inverse&#8221; can be treated as a useful alternative definition, even if &#8220;officially&#8221; it is an equivalent property.</p>
<p>There is one exception to the second moral. If you are asked to prove that a function is a bijection if and only if it has an inverse, then it is not legitimate to interpret &#8220;is a bijection&#8221; as &#8220;has an inverse&#8221; and then argue that the statement is a tautology.</p>
<p>This touches on a question that many beginning mathematicians have asked: &#8220;What am I allowed to assume?&#8221; Many cases of this question are covered by the following principle.</p>
<ul>
<li>If a definition is followed by some basic equivalent properties, then at some point you need to prove the equivalence. Once you have proved it, from that point on you can assume it. If you are doing an exercise, it should be clear whether you are being tested on the equivalence itself or on something that is &#8220;beyond&#8221; the equivalence.</li>
</ul>
<p>A slightly more subtle situation occurs with the following example. Suppose you are asked to prove that the function <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> is a bijection from the real numbers to the positive real numbers. Following my advice, you might say, &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Clog%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(x)' title='&#92;log(x)' class='latex' /> is an inverse!&#8221; and leave it at that. That might be a legitimate argument, but only if your definition of <img src='http://s0.wp.com/latex.php?latex=%5Clog%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(x)' title='&#92;log(x)' class='latex' /> is not &#8220;the inverse of <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />&#8221;. And if you define <img src='http://s0.wp.com/latex.php?latex=%5Clog%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(x)' title='&#92;log(x)' class='latex' /> in a different way &#8212; for example as <img src='http://s0.wp.com/latex.php?latex=%5Cint_1%5Ext%5E%7B-1%7Ddt&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;int_1^xt^{-1}dt' title='&#92;int_1^xt^{-1}dt' class='latex' /> &#8212; then it may take you quite a bit of work to prove that what you have defined really does invert the function <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />. (It can be done, however. 
You might like to see whether you can use that integral definition of <img src='http://s0.wp.com/latex.php?latex=%5Clog%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(x)' title='&#92;log(x)' class='latex' /> to prove that <img src='http://s0.wp.com/latex.php?latex=%5Clog%28xy%29%3D%5Clog%28x%29%2B%5Clog%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(xy)=&#92;log(x)+&#92;log(y)' title='&#92;log(xy)=&#92;log(x)+&#92;log(y)' class='latex' />, and then use that property to prove that <img src='http://s0.wp.com/latex.php?latex=%5Clog%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(x)' title='&#92;log(x)' class='latex' /> inverts some exponential function, and finally a bit of calculus to prove that the derivative of that exponential function at 0 is 1. However, it is easier &#8212; though less illuminating &#8212; to prove that <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> is an injection and a surjection. The former is true because <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> is a strictly increasing function, and the latter is a consequence of the intermediate value theorem, which is discussed later in the post.) </p>
<p><strong>2. Invertible matrices.</strong></p>
<p>An <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> matrix <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is defined to be <em>invertible</em> if there is an <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> matrix <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=AB%3DBA%3DI_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='AB=BA=I_n' title='AB=BA=I_n' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=I_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='I_n' title='I_n' class='latex' /> is the <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> identity matrix. In a typical linear algebra course, the following statements are all shown to be equivalent to the statement that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is invertible. (I&#8217;ll take the matrix to have real-number entries.)</p>
<p>(i) The only solution to the equation <img src='http://s0.wp.com/latex.php?latex=Ax%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Ax=0' title='Ax=0' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=x%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=0' title='x=0' class='latex' />. (Here, <img src='http://s0.wp.com/latex.php?latex=0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0' title='0' class='latex' /> denotes the column vector of height <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> that consists entirely of zeros.)</p>
<p>(ii) For every column vector <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> of height <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> the equation <img src='http://s0.wp.com/latex.php?latex=Ax%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Ax=b' title='Ax=b' class='latex' /> has a solution.</p>
<p>(iii) The row-rank of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>(iv) The column-rank of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>(v) The determinant of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is non-zero.</p>
<p>A matrix that is not invertible is called <em>singular</em>. If you want to decide whether a matrix is invertible or singular, you can choose any one of the properties (i)-(v) and see whether it holds. And if you know that a matrix has one of the five properties (i)-(v), then you are free to use any of the other properties. For example, if in the particular problem you are working on it happens to be easy to show that the determinant of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is zero, you could slip over to (i) and say, &#8220;Let us choose <img src='http://s0.wp.com/latex.php?latex=x%5Cne+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne 0' title='x&#92;ne 0' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Ax%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Ax=0' title='Ax=0' class='latex' />.&#8221;</p>
<p>In this case, it is probably best to think of invertibility and singularity as <em>clusters</em> of related properties rather than single properties. At any rate, you shouldn&#8217;t be wedded to any particular property as &#8220;the main&#8221; one.</p>
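<p>The move from (v) to (i) described above can be seen in a very small example. The following Python sketch uses a particular 2-by-2 matrix and hypothetical helper names <code>det2</code> and <code>apply</code>, all chosen just for illustration.</p>

```python
# A 2x2 example with integer entries, chosen for illustration.
A = [[1, 2],
     [2, 4]]

def det2(M):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def apply(M, x):
    """Multiply the matrix M by the column vector x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

# Property (v) fails: the determinant is zero.
print(det2(A))

# So property (i) must fail as well: a non-zero x with Ax = 0 exists.
x = [2, -1]
print(apply(A, x))
```

<p>Here the vector <code>x</code> was found by inspection (the second row is twice the first). The equivalence of the properties is what guarantees that such a vector exists whenever the determinant is zero.</p>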
<p><strong>3. Highest common factors.</strong></p>
<p>A <em>factor</em> of a positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a positive integer <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=m%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m|n' title='m|n' class='latex' />. A <em>common factor</em> of two positive integers <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a positive integer <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=d%7Cm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d|m' title='d|m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=d%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d|n' title='d|n' class='latex' />. The <em>highest common factor</em> of two positive integers <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is &#8230; well &#8230; the biggest of all the common factors. The phrase &#8220;highest common factor&#8221; is so suggestive that the temptation to think of the definition I have just given as the primary one is extremely strong. And yet for many problems it is not at all the most convenient definition to use. </p>
<p>What other equivalent definition is there? Well, it won&#8217;t be presented to you as an equivalent definition: more likely, it will be presented as part of the proof of a very important <em>fact</em> about highest common factors, sometimes known as B&eacute;zout&#8217;s theorem. It is the following result.</p>
<p><strong>Theorem.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be two positive integers, and let <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> be the highest common factor of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. Then there exist integers <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=hx%2Bky%3Dd&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky=d' title='hx+ky=d' class='latex' />.</em></p>
<p>How is this number <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> identified? It is the smallest positive integer that can be written in the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' />. Once you have proved that the smallest positive integer <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> that can be written in the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' /> is indeed a factor of both <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, you immediately obtain as a consequence that every integer of the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />. You also immediately obtain as a consequence that every multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> can be written in the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' />. </p>
<p>That gives us two alternative definitions of the highest common factor of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />.</p>
<p>(i) The highest common factor of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> is the smallest positive integer of the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> are integers.</p>
<p>(ii) The highest common factor of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> is the positive integer <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> such that the set of integers of the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky' title='hx+ky' class='latex' /> coincides with the set of all multiples of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />. </p>
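<p>If you would like to experiment with these two definitions, here is a rough Python sketch. It uses the extended Euclidean algorithm, which is one standard way of finding suitable coefficients but is not the argument given above. The function name <code>bezout</code>, the sample numbers and the size of the search window are all assumptions made for the example.</p>

```python
from math import gcd

def bezout(x, y):
    """Return (d, h, k) with h*x + k*y == d, where d is the highest common factor."""
    # Invariant: r0 == h0*x + k0*y and r1 == h1*x + k1*y.
    r0, h0, k0 = x, 1, 0
    r1, h1, k1 = y, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        h0, h1 = h1, h0 - q * h1
        k0, k1 = k1, k0 - q * k1
    return r0, h0, k0

x, y = 30, 42
d, h, k = bezout(x, y)
print(d, h, k, h * x + k * y)

# Definition (i), checked by brute force over a small window of coefficients.
window = range(-50, 51)
values = {s * x + t * y for s in window for t in window}
smallest_positive = min(v for v in values if v > 0)
print(smallest_positive == gcd(x, y))

# One half of definition (ii): every value found is a multiple of d.
print(all(v % d == 0 for v in values))
```

<p>The brute-force search only looks at a bounded window of coefficients, so it is a sanity check and not a proof.</p>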
<p>Now let&#8217;s see how we can put these alternative definitions to use. Suppose that we are asked to prove the following simple fact. I&#8217;ll use the standard notation <img src='http://s0.wp.com/latex.php?latex=%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(m,n)' title='(m,n)' class='latex' /> to stand for the highest common factor of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p><strong>Exercise.</strong> Let <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> be positive integers. Suppose that <img src='http://s0.wp.com/latex.php?latex=a%7Cmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a|mn' title='a|mn' class='latex' /> and that <img src='http://s0.wp.com/latex.php?latex=%28a%2Cm%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,m)=1' title='(a,m)=1' class='latex' />. Prove that <img src='http://s0.wp.com/latex.php?latex=a%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a|n' title='a|n' class='latex' />.</p>
<p>I think the majority of first-year undergraduates look at this statement and think something like this: &#8220;Every prime factor of <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> goes into <img src='http://s0.wp.com/latex.php?latex=mn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn' title='mn' class='latex' />, but it can&#8217;t go into <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> because <img src='http://s0.wp.com/latex.php?latex=%28a%2Cm%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,m)=1' title='(a,m)=1' class='latex' />, so it must go into <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.&#8221; As it stands, that argument isn&#8217;t quite correct, because it ignores the case where a prime goes into <img src='http://s0.wp.com/latex.php?latex=mn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn' title='mn' class='latex' /> more than once. But it can be turned into an ugly proof if you want.</p>
<p>What is ugly about it? Two things. First, it uses the fundamental theorem of arithmetic (that every positive integer has a unique factorization into primes), when it could get away with a more basic tool, namely B&eacute;zout&#8217;s theorem. Secondly, writing out proofs of this kind properly involves writing expressions like <img src='http://s0.wp.com/latex.php?latex=p_1%5E%7Ba_1%7D%5Cdots+p_r%5E%7Ba_r%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p_1^{a_1}&#92;dots p_r^{a_r}' title='p_1^{a_1}&#92;dots p_r^{a_r}' class='latex' />, which is a pain.</p>
<p>But if you agree with my aesthetic sensibilities (which probably a fair percentage of you won&#8217;t, but I hope you&#8217;ll eventually come to change your minds), then there remains the question of <em>how</em> exactly you use B&eacute;zout&#8217;s theorem to prove a statement like this. And here is where alternative definitions come into play. In this case, they don&#8217;t really give you anything that you don&#8217;t get from the general instruction to use B&eacute;zout&#8217;s theorem, but they do at least keep you focused on that goal.</p>
<p>The statement that <img src='http://s0.wp.com/latex.php?latex=%28a%2Cm%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,m)=1' title='(a,m)=1' class='latex' /> becomes, if we are thinking about highest common factors in a B&eacute;zout&#8217;s-theorem kind of way, the statement that we can write 1 as <img src='http://s0.wp.com/latex.php?latex=ha%2Bkm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ha+km' title='ha+km' class='latex' />. We&#8217;re also given that <img src='http://s0.wp.com/latex.php?latex=a%7Cmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a|mn' title='a|mn' class='latex' />, and we might like to note that that implies that <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> goes into any number of the form <img src='http://s0.wp.com/latex.php?latex=ra%2Bsmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ra+smn' title='ra+smn' class='latex' />. Since we want to show that <img src='http://s0.wp.com/latex.php?latex=a%7Cn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a|n' title='a|n' class='latex' />, it makes sense to try to write <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> in the form <img src='http://s0.wp.com/latex.php?latex=ra%2Bsmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ra+smn' title='ra+smn' class='latex' />. We know that <img src='http://s0.wp.com/latex.php?latex=1%3Dha%2Bkm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1=ha+km' title='1=ha+km' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=n%3Dhan%2Bkmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=han+kmn' title='n=han+kmn' class='latex' />, which is indeed of the desired form.</p>
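<p>Here is the same argument carried out with actual numbers, as a small Python sketch. The values of <em>a</em>, <em>m</em> and <em>n</em> are illustrative, with <em>a</em> deliberately not prime. Using the three-argument form of <code>pow</code> to find <em>h</em> is just a convenience and assumes Python 3.8 or later.</p>

```python
a, m, n = 9, 4, 18          # illustrative values: a divides m*n and (a, m) = 1
assert (m * n) % a == 0

# Find integers h, k with h*a + k*m == 1 (three-argument pow needs Python 3.8+).
h = pow(a, -1, m)
k = (1 - h * a) // m
assert h * a + k * m == 1

# Multiply through by n: n == (h*n)*a + k*(m*n).
first, second = h * a * n, k * m * n
print(first + second == n)

# Both terms are multiples of a, and therefore so is n.
print(first % a, second % a, n % a)
```

<p>The first term is visibly a multiple of <em>a</em>, and the second is a multiple of <em>a</em> because <em>a</em> divides <em>mn</em>. That is all the proof uses.</p>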
<p>That proof is perhaps too similar to the proof that if a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> divides a product <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />. So let&#8217;s try another one. It&#8217;s the statement that if <img src='http://s0.wp.com/latex.php?latex=%28m%2Cn%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(m,n)=1' title='(m,n)=1' class='latex' /> and two numbers x and y are congruent mod <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and congruent mod <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, then they must be congruent mod <img src='http://s0.wp.com/latex.php?latex=mn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn' title='mn' class='latex' />. (If you haven&#8217;t met it yet, two numbers are congruent mod <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> if they differ by a multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />.)</p>
<p>Just before I start, let me point out that the condition <img src='http://s0.wp.com/latex.php?latex=%28m%2Cn%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(m,n)=1' title='(m,n)=1' class='latex' /> is necessary. For example, if <img src='http://s0.wp.com/latex.php?latex=m%3D8&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=8' title='m=8' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n%3D12&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=12' title='n=12' class='latex' />, then 5 and 29 are congruent mod <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and congruent mod <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> but not congruent mod <img src='http://s0.wp.com/latex.php?latex=mn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn' title='mn' class='latex' />. </p>
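<p>The arithmetic in that counterexample is easy to check by machine. The following lines are a quick sketch and nothing more.</p>

```python
from math import gcd

m, n = 8, 12
x, y = 5, 29

# The highest common factor of m and n is not 1.
print(gcd(m, n))

# Remainders of x - y on division by m, by n and by mn.
print((x - y) % m, (x - y) % n, (x - y) % (m * n))
```

<p>The first two remainders are zero and the third is not, which is exactly what the example claims.</p>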
<p>I am thinking of the statement <img src='http://s0.wp.com/latex.php?latex=%28m%2Cn%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(m,n)=1' title='(m,n)=1' class='latex' /> as telling me that I can write <img src='http://s0.wp.com/latex.php?latex=hm%2Bkn%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hm+kn=1' title='hm+kn=1' class='latex' /> for some integers <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. That is what it <em>means</em> to me. (Note that it is more helpful because it tells me that something exists, namely <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, rather than that something doesn&#8217;t exist, namely a non-trivial common factor.) What else am I given? I&#8217;m given that <img src='http://s0.wp.com/latex.php?latex=x-y%3Dam&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=am' title='x-y=am' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x-y%3Dbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=bn' title='x-y=bn' class='latex' /> for some <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />. I now want to prove that <img src='http://s0.wp.com/latex.php?latex=x-y%3Dcmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=cmn' title='x-y=cmn' class='latex' /> for some integer <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />. </p>
<p>Let us write down the three equations we have and the equation we want.</p>
<p><img src='http://s0.wp.com/latex.php?latex=hm%2Bkn%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hm+kn=1' title='hm+kn=1' class='latex' /><br />
<img src='http://s0.wp.com/latex.php?latex=x-y%3Dam&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=am' title='x-y=am' class='latex' /><br />
<img src='http://s0.wp.com/latex.php?latex=x-y%3Dbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=bn' title='x-y=bn' class='latex' /><br />
&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
<img src='http://s0.wp.com/latex.php?latex=x-y%3Dcmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=cmn' title='x-y=cmn' class='latex' /></p>
<p>We don&#8217;t really care about <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, so a fairly obvious thing to do is write <img src='http://s0.wp.com/latex.php?latex=am%3Dbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am=bn' title='am=bn' class='latex' /> and change our target to that of proving that <img src='http://s0.wp.com/latex.php?latex=am&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am' title='am' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=bn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='bn' title='bn' class='latex' /> are both multiples of <img src='http://s0.wp.com/latex.php?latex=mn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn' title='mn' class='latex' />. But if we want to prove that <img src='http://s0.wp.com/latex.php?latex=am%3Dcmn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am=cmn' title='am=cmn' class='latex' />, we are trying to prove that <img src='http://s0.wp.com/latex.php?latex=a%3Dcn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a=cn' title='a=cn' class='latex' />. That is, we want to prove that <img src='http://s0.wp.com/latex.php?latex=n%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n|a' title='n|a' class='latex' /> (or we could if we wanted prove that <img src='http://s0.wp.com/latex.php?latex=m%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m|b' title='m|b' class='latex' />).</p>
<p>So after a tiny bit of rearranging, the problem is this.</p>
<p><img src='http://s0.wp.com/latex.php?latex=hm%2Bkn%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hm+kn=1' title='hm+kn=1' class='latex' /><br />
<img src='http://s0.wp.com/latex.php?latex=am%3Dbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am=bn' title='am=bn' class='latex' /><br />
&#8212;&#8212;&#8212;&#8212;&#8211;<br />
<img src='http://s0.wp.com/latex.php?latex=n%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n|a' title='n|a' class='latex' /></p>
<p>How do we use B&eacute;zout&#8217;s theorem to show that things divide other things? Well, in the previous proof we took a statement of the form <img src='http://s0.wp.com/latex.php?latex=hx%2Bky%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hx+ky=1' title='hx+ky=1' class='latex' /> and multiplied it by the number we wanted the other number to go into. Let&#8217;s do that here. We want <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> to go into <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, so let&#8217;s multiply the first equation by <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and hope for the best. We get this.</p>
<p><img src='http://s0.wp.com/latex.php?latex=hma%2Bkna%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hma+kna=a' title='hma+kna=a' class='latex' /></p>
<p>We&#8217;ll be done if we can show that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> goes into the left-hand side. Does it? Well, it certainly goes into the second term, but what about the first? </p>
<p>Hang on, we haven&#8217;t used all the information yet. What else did we know? Oh yes, that <img src='http://s0.wp.com/latex.php?latex=am%3Dbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am=bn' title='am=bn' class='latex' />. That tells us that <img src='http://s0.wp.com/latex.php?latex=hma%3Dhbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='hma=hbn' title='hma=hbn' class='latex' />, so the first term is divisible by <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> too.</p>
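<p>Putting the last two steps together gives <em>a</em> = <em>hma</em> + <em>kna</em> = <em>hbn</em> + <em>kna</em> = <em>n</em>(<em>hb</em> + <em>ka</em>), so one can take <em>c</em> = <em>hb</em> + <em>ka</em>. The Python sketch below checks this with illustrative numbers. The particular values are my own choice, and the use of three-argument <code>pow</code> assumes Python 3.8 or later.</p>

```python
m, n = 4, 9                  # illustrative coprime moduli
x, y = 77, 5                 # congruent mod 4 and mod 9
a, b = (x - y) // m, (x - y) // n
assert a * m == x - y == b * n

# Integers h, k with h*m + k*n == 1 (three-argument pow needs Python 3.8+).
h = pow(m, -1, n)
k = (1 - h * m) // n
assert h * m + k * n == 1

# a = hma + kna = hbn + kna = n(hb + ka), so x - y = am = (hb + ka)mn.
c = h * b + k * a
print(a == n * c, x - y == c * m * n)
```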
<p>Here&#8217;s an exercise that&#8217;s well worth trying. See if you can prove that the lowest common multiple of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=mn%2F%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn/(m,n)' title='mn/(m,n)' class='latex' />, and see if you can do it without the help of the fundamental theorem of arithmetic. More precisely, your task is to prove that <img src='http://s0.wp.com/latex.php?latex=mn%2F%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn/(m,n)' title='mn/(m,n)' class='latex' /> is a multiple of both <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, and that every number that&#8217;s a multiple of both <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=mn%2F%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='mn/(m,n)' title='mn/(m,n)' class='latex' />. </p>
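<p>Before attempting the proof, you may find it reassuring to test the statement numerically. The sketch below does that for small values. The function name <code>candidate</code> and the range searched are arbitrary choices, and a finite check like this is not a proof.</p>

```python
from math import gcd

def candidate(m, n):
    """The number mn/(m,n) from the exercise."""
    return m * n // gcd(m, n)

ok = True
for m in range(1, 30):
    for n in range(1, 30):
        c = candidate(m, n)
        # c is a common multiple of m and n ...
        ok = ok and c % m == 0 and c % n == 0
        # ... and every common multiple up to m*n is a multiple of c.
        common = [t for t in range(1, m * n + 1) if t % m == 0 and t % n == 0]
        ok = ok and all(t % c == 0 for t in common)
print(ok)
```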
<p>I think it is highly likely that many people reading this will be far from convinced that using B&eacute;zout&#8217;s theorem is better than simply writing out prime factorizations. Let me try to explain again why I prefer it (and am not alone in this view). It&#8217;s partly a distaste for using results that are &#8220;more advanced&#8221; than what you are trying to prove. An extreme example is the result that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is a prime that divides <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' /> then either <img src='http://s0.wp.com/latex.php?latex=p%7Ca&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|a' title='p|a' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=p%7Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p|b' title='p|b' class='latex' />. That result is (usually) used to prove the fundamental theorem of arithmetic, so you shouldn&#8217;t use the fundamental theorem of arithmetic to prove <em>it</em>. Since the other results are of a similar flavour to that one, it seems somehow more appropriate to use similar techniques.</p>
<p>Another potential advantage is that there are algebraic structures called <em>rings</em> that are somewhat like the integers (in that you can add and multiply elements together but you can&#8217;t necessarily divide them) in which the fundamental theorem of arithmetic does not hold. I&#8217;m not enough of an algebraist (or algebraic number theorist) to have examples at my fingertips, but I am almost sure that there are results that can be generalized from the integers to more general rings, but only if you use a B&eacute;zout-type proof. If anyone can supply me with an example, I will be very grateful. (The basic point, however, is that in some rings unique factorization doesn&#8217;t hold. In such rings, it is no longer clear that any two elements <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> must have a common factor that&#8217;s a multiple of all other common factors. But we can still look at the set of numbers of the form <img src='http://s0.wp.com/latex.php?latex=ax%2Bby&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax+by' title='ax+by' class='latex' />, which forms an object called an <em>ideal</em>. An example of a ring in which unique factorization fails is the set of all numbers of the form <img src='http://s0.wp.com/latex.php?latex=m%2Bn%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m+n&#92;sqrt{-5}' title='m+n&#92;sqrt{-5}' class='latex' /> where <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> are integers. 
In this ring the number 6 can be factorized as <img src='http://s0.wp.com/latex.php?latex=2%5Ctimes+3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2&#92;times 3' title='2&#92;times 3' class='latex' /> and also as <img src='http://s0.wp.com/latex.php?latex=%281%2B%5Csqrt%7B-5%7D%29%281-%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1+&#92;sqrt{-5})(1-&#92;sqrt{-5})' title='(1+&#92;sqrt{-5})(1-&#92;sqrt{-5})' class='latex' />. In this ring, 6 and <img src='http://s0.wp.com/latex.php?latex=2%281%2B%5Csqrt%7B-5%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2(1+&#92;sqrt{-5})' title='2(1+&#92;sqrt{-5})' class='latex' /> have both 2 and <img src='http://s0.wp.com/latex.php?latex=1%2B%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1+&#92;sqrt{-5}' title='1+&#92;sqrt{-5}' class='latex' /> as common factors, but there isn&#8217;t &#8212; I&#8217;m pretty sure &#8212; a common factor that&#8217;s a multiple of both 2 and <img src='http://s0.wp.com/latex.php?latex=1%2B%5Csqrt%7B-5%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1+&#92;sqrt{-5}' title='1+&#92;sqrt{-5}' class='latex' />.)</p>
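<p>The claim about common factors can be tested by brute force. The sketch below is an illustration under stated assumptions, not a proof: it represents the number m + n&radic;(-5) as the pair <code>(m, n)</code>, and the helper names <code>mul</code>, <code>norm</code> and <code>divides</code> are hypothetical. It relies on the fact that the norm m&sup2; + 5n&sup2; is multiplicative, so any factor of 6 has norm at most 36 and only finitely many candidates need to be examined.</p>

```python
from itertools import product


def mul(x, y):
    """Multiply (a + b*sqrt(-5)) by (c + d*sqrt(-5))."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)


def norm(x):
    """Norm of a + b*sqrt(-5), namely a^2 + 5b^2."""
    a, b = x
    return a * a + 5 * b * b


def divides(d, x):
    """True if x = d*q for some q with integer coordinates (d nonzero)."""
    a, b = d
    numerator = mul(x, (a, -b))  # x times the conjugate of d
    n = norm(d)
    return numerator[0] % n == 0 and numerator[1] % n == 0


six = (6, 0)
other = mul((2, 0), (1, 1))  # 2(1 + sqrt(-5))

# Any factor of 6 has norm dividing 36, so |a| <= 6 and |b| <= 2.
candidates = [d for d in product(range(-6, 7), range(-2, 3)) if d != (0, 0)]
common_factors = [d for d in candidates if divides(d, six) and divides(d, other)]

# Common factors that are multiples of both 2 and 1 + sqrt(-5).
multiples_of_both = [
    d for d in common_factors if divides((2, 0), d) and divides((1, 1), d)
]
```

<p>The list <code>common_factors</code> contains both <code>(2, 0)</code> and <code>(1, 1)</code>, while <code>multiples_of_both</code> comes out empty, which agrees with the claim. The underlying reason is that such an element would need norm exactly 12, and a&sup2; + 5b&sup2; = 12 has no integer solutions.</p>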
<p><strong>4. Normal subgroups.</strong></p>
<p>If you haven&#8217;t met normal subgroups yet, you will do soon. Here is the usual definition.</p>
<p><strong>Definition.</strong> Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> be a group and let <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> be a subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' />. We say that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is <em>normal</em> if <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}&#92;in H' title='ghg^{-1}&#92;in H' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />. </p>
<p>If you are given a subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> and asked to prove that it is normal, then the obvious thing to do is look at an arbitrary group element of the form <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}' title='ghg^{-1}' class='latex' /> and show that it belongs to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. But that is often not the simplest way and even more often not the way that gives you the greatest insight into <em>why</em> <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is normal.</p>
<p>What is the alternative definition here? Well, a very important fact about normal subgroups (I would call it the most important fact myself) is this.</p>
<p><strong>Proposition.</strong> A subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is normal if and only if there exists a group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> and a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AG%5Cto+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:G&#92;to K' title='&#92;phi:G&#92;to K' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is the kernel of <img src='http://s0.wp.com/latex.php?latex=%5Cphi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi' title='&#92;phi' class='latex' />. </p>
<p>If you don&#8217;t know what homomorphisms and kernels are, then there are two things you could do at this point. One is to come back and read this when you&#8217;ve been shown them in lectures. Another is to <a href="http://en.wikipedia.org/wiki/Group_homomorphism">look them up on Wikipedia</a> &#8212; they aren&#8217;t that difficult. If you decide to skip this section and never come back to it, then please at least go away with this message in mind: <em>it is often better to think of normal subgroups as kernels of homomorphisms rather than as subgroups that satisfy the condition in the original definition</em>.</p>
<p>Let me give a simple example of a problem where thinking of normal subgroups this way is helpful. Recall that the <em>dihedral group</em> <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> is the symmetry group of a regular <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-gon. (Some people write <img src='http://s0.wp.com/latex.php?latex=D_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_n' title='D_n' class='latex' /> for this but I think <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> is the Cambridge standard.) This splits up into <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> rotations and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> reflections. The rotations form a subgroup and that subgroup is normal. Why?</p>
<p>I won&#8217;t bother to give the argument that works directly from the definition. Instead, I want to show that the group of rotations is the kernel of some homomorphism. That is, I want to find a group <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> and a homomorphism <img src='http://s0.wp.com/latex.php?latex=%5Cphi%3AD_%7B2n%7D%5Cto+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi:D_{2n}&#92;to K' title='&#92;phi:D_{2n}&#92;to K' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cphi%28g%29%3De&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;phi(g)=e' title='&#92;phi(g)=e' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is a rotation.</p>
<p>There are many ways to think about this. Probably the best is to use the concept of a group action, but since you haven&#8217;t got to that yet I will avoid it. (However, later on I plan a post on group actions and I&#8217;ll come back to this point.) Instead, let me simply define a homomorphism from <img src='http://s0.wp.com/latex.php?latex=D_%7B2n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D_{2n}' title='D_{2n}' class='latex' /> to the 2-element group, which I&#8217;ll think of as <img src='http://s0.wp.com/latex.php?latex=%5C%7B-1%2C1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{-1,1&#92;}' title='&#92;{-1,1&#92;}' class='latex' /> under multiplication, by sending all rotations to <img src='http://s0.wp.com/latex.php?latex=1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1' title='1' class='latex' /> and all reflections to <img src='http://s0.wp.com/latex.php?latex=-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-1' title='-1' class='latex' />. Basically, a transformation maps to 1 if you don&#8217;t have to turn your polygon over and to -1 if you do. (If you want, you can think of this homomorphism as the determinant of the linear map that does the transformation.) </p>
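<p>For small n the homomorphism property can be checked mechanically. The sketch below makes several assumptions that are not in the text above: the vertices are labelled 0 to n-1, each symmetry is stored as a tuple recording where each vertex goes, and a symmetry counts as a rotation when it preserves the cyclic order of the vertices. All function names are hypothetical.</p>

```python
def dihedral(n):
    """Return (rotations, reflections) of a regular n-gon as vertex permutations."""
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    reflections = [tuple((k - i) % n for i in range(n)) for k in range(n)]
    return rotations, reflections


def compose(g, h):
    """The symmetry obtained by applying h first and then g."""
    return tuple(g[h[i]] for i in range(len(h)))


def phi(g):
    """1 if g keeps the polygon the same way up, -1 if it turns it over."""
    n = len(g)
    return 1 if (g[1] - g[0]) % n == 1 else -1


def is_homomorphism(n):
    """Check phi(gh) = phi(g)phi(h) for every pair of symmetries."""
    rotations, reflections = dihedral(n)
    group = rotations + reflections
    return all(phi(compose(g, h)) == phi(g) * phi(h) for g in group for h in group)


def kernel(n):
    """The set of symmetries that phi sends to the identity 1."""
    rotations, reflections = dihedral(n)
    return {g for g in rotations + reflections if phi(g) == 1}
```

<p>For each n from 3 upwards that one cares to try, <code>is_homomorphism(n)</code> should hold and <code>kernel(n)</code> should be exactly the set of rotations.</p>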
<p>What this simple example illustrates is that normal subgroups are subgroups that &#8220;leave something alone&#8221;. In this case, what the rotations leave alone is the way up that the polygon is. If you can find a proof of this kind, then you &#8220;feel the normality&#8221; in a way that you don&#8217;t if all you&#8217;ve done is some calculation that happens to show that <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}' title='ghg^{-1}' class='latex' /> belongs to <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />. And sometimes it is much simpler. For example, the set of <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> real matrices of determinant 1 is a normal subgroup of the set of all non-singular <img src='http://s0.wp.com/latex.php?latex=n%5Ctimes+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;times n' title='n&#92;times n' class='latex' /> real matrices. I have seen supervisees prove this directly using the multiplicative property of the determinant (that det<img src='http://s0.wp.com/latex.php?latex=%28AB%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(AB)' title='(AB)' class='latex' />=det<img src='http://s0.wp.com/latex.php?latex=%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(A)' title='(A)' class='latex' />det<img src='http://s0.wp.com/latex.php?latex=%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(B)' title='(B)' class='latex' />) without noticing that that very same property shows that the determinant is a homomorphism to the non-zero real numbers and that the subgroup in question is the kernel of that homomorphism.</p>
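<p>A small 2-by-2 instance makes the determinant example concrete. This is only a sketch: the matrices <code>g</code> and <code>h</code> are arbitrary choices made for illustration, and exact rational arithmetic is used so that no rounding is involved.</p>

```python
from fractions import Fraction


def det(m):
    """Determinant of a 2-by-2 matrix given as a pair of rows."""
    (a, b), (c, d) = m
    return a * d - b * c


def matmul(x, y):
    """Product of two 2-by-2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(m):
    """Inverse of a non-singular 2-by-2 matrix, with exact fractions."""
    (a, b), (c, d) = m
    determinant = Fraction(det(m))
    return ((d / determinant, -b / determinant), (-c / determinant, a / determinant))


g = ((2, 1), (5, 7))  # non-singular: determinant 9
h = ((3, 2), (4, 3))  # determinant 1, so h lies in the subgroup

conjugate = matmul(matmul(g, h), inverse(g))
```

<p>Here <code>det(matmul(g, h))</code> equals <code>det(g) * det(h)</code>, and <code>det(conjugate)</code> equals 1, because the determinants of <code>g</code> and its inverse cancel. That cancellation is the kernel argument in miniature.</p>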
<p>Another nice example is the subgroup of the alternating group <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> that consists of the identity and the three transposition pairs <img src='http://s0.wp.com/latex.php?latex=%2812%29%2834%29%2C+%2813%29%2824%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(34), (13)(24)' title='(12)(34), (13)(24)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%2814%29%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(14)(23)' title='(14)(23)' class='latex' />. It isn&#8217;t too hard to check that these form a subgroup, but why is it normal? </p>
<p>One answer is that conjugating a permutation always gives you a permutation of the same cycle type. Since we have included all transposition pairs, this subgroup must be normal. But that answer doesn&#8217;t really give me any feel for what is special about the subgroup. Here is another argument. We can identify <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> with the group of rotations of a regular tetrahedron. (For example, if we label the places where a vertex can go by the numbers 1, 2, 3 and 4, then the 3-cycle <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' /> is the 120-degree rotation that fixes the vertex in place 4 and sends the vertex in place 1 to the vertex in place 2, the vertex in place 2 to the vertex in place 3, and the vertex in place 3 to the vertex in place 1.)</p>
<p>Now for each pair of opposite edges of the tetrahedron, we can draw a line joining the two midpoints. This gives us three lines that go through the centre. If we label the places where these lines can be with the numbers 1, 2 and 3, then with any rotation of the tetrahedron we can look at what it does to the lines in those places and define a corresponding permutation of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3&#92;}' title='&#92;{1,2,3&#92;}' class='latex' />. If, for instance, it sends the line in place 1 to the line in place 2 and vice versa then we associate with it the permutation <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' />. </p>
<p>It is not hard to check that this gives us a homomorphism from <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_3' title='S_3' class='latex' />. The kernel of this homomorphism is the set of permutations in <img src='http://s0.wp.com/latex.php?latex=A_4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A_4' title='A_4' class='latex' /> that correspond to rotations that send each of the three lines to itself (but possibly rotating it through 180 degrees about the centre of the tetrahedron). And if you think about it a bit, you will see that the rotations that do that are the identity and the ones that take one of those three lines and rotate about it by 180 degrees. And those three rotations correspond to transposition pairs.</p>
<p>This second argument is longer and &#8212; at least the way I have phrased it (trying to avoid talking about group actions) &#8212; harder to understand. But it explains the normality of that subgroup in a way that <em>means</em> something and that can be <em>visualized</em>, as opposed to the other argument, which just does a calculation that mysteriously works.</p>
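<p>The second argument can also be checked by computer. The sketch below rests on assumptions not made in the text: the four vertex places are labelled 0 to 3 rather than 1 to 4, and each of the three lines is identified with the pair of opposite edges it joins, that is, with a way of splitting the four vertices into two pairs. The names <code>PAIRINGS</code>, <code>act</code> and <code>phi</code> are hypothetical.</p>

```python
from itertools import permutations


def is_even(p):
    """True if the permutation p (a tuple) has an even number of inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return inversions % 2 == 0


A4 = [p for p in permutations(range(4)) if is_even(p)]

# Each line through the centre joins the midpoints of two opposite edges,
# so it corresponds to a splitting of the vertices into two pairs.
PAIRINGS = [
    frozenset({frozenset({0, 1}), frozenset({2, 3})}),
    frozenset({frozenset({0, 2}), frozenset({1, 3})}),
    frozenset({frozenset({0, 3}), frozenset({1, 2})}),
]


def act(p, pairing):
    """Where the permutation p sends a pair of opposite edges."""
    return frozenset(frozenset(p[v] for v in edge) for edge in pairing)


def phi(p):
    """The permutation of the three lines induced by p."""
    return tuple(PAIRINGS.index(act(p, q)) for q in PAIRINGS)


def compose(g, h):
    """Apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))


is_homomorphism = all(
    phi(compose(g, h)) == compose(phi(g), phi(h)) for g in A4 for h in A4
)
kernel = sorted(p for p in A4 if phi(p) == (0, 1, 2))
```

<p>The kernel consists of the identity together with the three permutations that swap the vertices in two disjoint pairs, which are the transposition pairs in the new labelling.</p>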
<p><strong>5. Continuous functions.</strong></p>
<p>You don&#8217;t meet continuous functions until next term. However, I think that if you read this section and skim over what you don&#8217;t understand, you will still get the point of what I am saying. And when you <em>have</em> come across continuous functions, then you may feel like coming back and having another look at this section.</p>
<p>The basic definition of a continuous function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> is this.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> is <em>continuous</em> if for every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> and every <img src='http://s0.wp.com/latex.php?latex=x%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in&#92;mathbb{R}' title='x&#92;in&#92;mathbb{R}' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=%5Cdelta%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;delta&gt;0' title='&#92;delta&gt;0' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=y%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in&#92;mathbb{R}' title='y&#92;in&#92;mathbb{R}' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=%7Cx-y%7C%3C%5Cdelta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|x-y|&lt;&#92;delta' title='|x-y|&lt;&#92;delta' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=%7Cf%28x%29-f%28y%29%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|f(x)-f(y)|&lt;&#92;epsilon' title='|f(x)-f(y)|&lt;&#92;epsilon' class='latex' />.</li>
<p>Fairly soon after that definition has been presented, it is customary to present the following result.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> is continuous if and only if for every sequence <img src='http://s0.wp.com/latex.php?latex=%28a_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a_n)' title='(a_n)' class='latex' /> that converges to a limit <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> the sequence <img src='http://s0.wp.com/latex.php?latex=%28f%28a_n%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(f(a_n))' title='(f(a_n))' class='latex' /> converges to <img src='http://s0.wp.com/latex.php?latex=f%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a)' title='f(a)' class='latex' />.</li>
<p>Since that is an equivalence, it can be used as an alternative definition. What are the signs that it is a more suitable definition? A rather simple and obvious one is that the proof so far should already have mentioned a convergent sequence, or, slightly less obviously, that you might like to bring in a convergent sequence later.</p>
<p>For example, there is a very useful tool in real analysis called the Bolzano-Weierstrass theorem. It says, &#8220;Every sequence in a closed bounded interval has a convergent subsequence.&#8221; If you&#8217;re planning to use that theorem, or have already used it, then the sequences definition of continuity is likely to be more convenient than the original definition.</p>
<p>Here is a second example. A well-known theorem of real analysis called the intermediate value theorem says that if <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a continuous function and <img src='http://s0.wp.com/latex.php?latex=f%28a%29%3C0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a)&lt;0' title='f(a)&lt;0' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28b%29%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(b)&gt;0' title='f(b)&gt;0' class='latex' /> then there is some <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' /> between <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28c%29%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)=0' title='f(c)=0' class='latex' />. One proof of this result starts like this. We&#8217;ll define two sequences <img src='http://s0.wp.com/latex.php?latex=a_0%2Ca_1%2Ca_2%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_0,a_1,a_2,&#92;dots' title='a_0,a_1,a_2,&#92;dots' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b_0%2Cb_1%2Cb_2%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_0,b_1,b_2,&#92;dots' title='b_0,b_1,b_2,&#92;dots' class='latex' /> as follows. [Already we have sequences, so already we should be thinking about using the sequence definition of continuity.] We start with <img src='http://s0.wp.com/latex.php?latex=a_0%3Da&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_0=a' title='a_0=a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b_0%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_0=b' title='b_0=b' class='latex' />. 
If <img src='http://s0.wp.com/latex.php?latex=f%28%28a_0%2Bb_0%29%2F2%29%5Cleq+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f((a_0+b_0)/2)&#92;leq 0' title='f((a_0+b_0)/2)&#92;leq 0' class='latex' /> then define that to be <img src='http://s0.wp.com/latex.php?latex=a_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_1' title='a_1' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=b_1%3Db_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_1=b_0' title='b_1=b_0' class='latex' />, while if <img src='http://s0.wp.com/latex.php?latex=f%28%28a_0%2Bb_0%29%2F2%29%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f((a_0+b_0)/2)&gt;0' title='f((a_0+b_0)/2)&gt;0' class='latex' /> then define that to be <img src='http://s0.wp.com/latex.php?latex=b_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_1' title='b_1' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=a_1%3Da_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_1=a_0' title='a_1=a_0' class='latex' />. Continue this process, each time replacing one of <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_n' title='b_n' class='latex' /> by the average and leaving the other one unchanged, and doing it in such a way that at each stage <img src='http://s0.wp.com/latex.php?latex=f%28a_n%29%5Cleq+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a_n)&#92;leq 0' title='f(a_n)&#92;leq 0' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28b_n%29%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(b_n)&gt;0' title='f(b_n)&gt;0' class='latex' />. </p>
<p>A basic axiom for the real numbers tells us that <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b_n' title='b_n' class='latex' /> both converge, and simple results can be used to show that they converge to the same limit (something that might seem obvious, but it needs a proof). That limit is our <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />, and it remains to prove that <img src='http://s0.wp.com/latex.php?latex=f%28c%29%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)=0' title='f(c)=0' class='latex' />. </p>
<p>It is here that the sequence definition of continuity comes into play. Since <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' /> converges to <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />, we know that <img src='http://s0.wp.com/latex.php?latex=f%28a_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a_n)' title='f(a_n)' class='latex' /> converges to <img src='http://s0.wp.com/latex.php?latex=f%28c%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)' title='f(c)' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=f%28a_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a_n)' title='f(a_n)' class='latex' /> is always non-positive, so is <img src='http://s0.wp.com/latex.php?latex=f%28c%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)' title='f(c)' class='latex' />. (This is another simple principle from real analysis.) Similarly, <img src='http://s0.wp.com/latex.php?latex=f%28b_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(b_n)' title='f(b_n)' class='latex' /> is always non-negative, which implies that <img src='http://s0.wp.com/latex.php?latex=f%28c%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)' title='f(c)' class='latex' /> is non-negative. The only number that is non-negative and non-positive is 0, so we are done.</p>
<p>It is perfectly possible to show that <img src='http://s0.wp.com/latex.php?latex=f%28c%29%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(c)=0' title='f(c)=0' class='latex' /> using the original definition of continuity, but the proof is longer and repeats some of the steps that are used to prove that the sequences definition follows from the original definition. Once one has made the effort to prove the sequences definition, one might as well use it.</p>
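<p>The construction of the two sequences is the bisection method, and it can be run numerically. The following is a minimal sketch in floating-point arithmetic, which only approximates the limit and says nothing about why the limit exists; the function name <code>bisect_root</code> and the number of steps are assumptions made for illustration.</p>

```python
def bisect_root(f, a, b, steps=60):
    """Return the final pair (a_n, b_n) after the given number of halvings.

    Assumes f(a) < 0 < f(b); throughout, f(a_n) <= 0 and f(b_n) > 0.
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    for _ in range(steps):
        mid = (a + b) / 2
        if f(mid) <= 0:
            a = mid  # the average becomes the new a_n
        else:
            b = mid  # the average becomes the new b_n
    return a, b


def example(x):
    """A continuous function with example(0) < 0 < example(2)."""
    return x * x - 2


left, right = bisect_root(example, 0.0, 2.0)
```

<p>In this example <code>left</code> and <code>right</code> both end up very close to the square root of 2, and the two inequalities that the proof relies on still hold for them.</p>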
<p>The sequences definition is by no means the end of the story. For example, another basic result about continuous functions (that applies in a much more general context) is this.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> is continuous if and only if <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(A)' title='f^{-1}(A)' class='latex' /> is open for every open set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</li>
<p>And that has an equivalent formulation that is sometimes more convenient to use.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> is continuous if and only if <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(A)' title='f^{-1}(A)' class='latex' /> is closed for every closed set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</li>
<p>And sometimes there are alternative definitions that work in particular contexts. For example, there is a mathematical concept known as a normed space. By far the most useful definition of continuity in the theory of normed spaces is this &#8212; but it applies only to certain sorts of functions.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> be normed spaces. A linear map <img src='http://s0.wp.com/latex.php?latex=T%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='T:X&#92;to Y' title='T:X&#92;to Y' class='latex' /> is continuous if and only if there exists a constant <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5C%7CTx%5C%7C%5Cleq+C%5C%7Cx%5C%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;|Tx&#92;|&#92;leq C&#92;|x&#92;|' title='&#92;|Tx&#92;|&#92;leq C&#92;|x&#92;|' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />.</li>
<p>Again, it doesn&#8217;t matter if you haven&#8217;t the faintest idea what that means. The point is that it looks different from the usual definition, but it is equivalent to it in this particular context and is much easier to use. </p>
<p><strong>6. Differentiable functions.</strong></p>
<p>I am not going to give a list of alternative definitions of differentiability. Instead, this section is here to give me an excuse to direct you towards <a href="http://arxiv.org/pdf/math/9404236v1">a famous essay of William Thurston</a> that includes, near the beginning, a discussion of the numerous ways that mathematicians have of thinking about differentiation.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/25/alternative-definitions/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Definitions</title>
		<link>http://gowers.wordpress.com/2011/10/23/definitions/</link>
		<comments>http://gowers.wordpress.com/2011/10/23/definitions/#comments</comments>
		<pubDate>Sun, 23 Oct 2011 22:07:23 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[General concepts]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3568</guid>
		<description><![CDATA[By now you will have seen several definitions in lectures. Many of them will be written in the form Definition. A blah is &#8230; That is, the definition is displayed and the word being defined is in italics (or underlined if somebody is writing by hand). Sometimes, one doesn&#8217;t bother with the display, and simply [...]]]></description>
			<content:encoded><![CDATA[<p>By now you will have seen several definitions in lectures. Many of them will be written in the form </p>
<p><strong>Definition.</strong> A <em>blah</em> is &#8230;</p>
<p>That is, the definition is displayed and the word being defined is in italics (or underlined if somebody is writing by hand). Sometimes, one doesn&#8217;t bother with the display, and simply says, during a discussion, &#8220;We define a <em>blah</em> to be &#8230;&#8221;</p>
<p>What is likely to have been emphasized less is that there are several different kinds of definition. In this post I&#8217;d like to enumerate some of them and give examples. Each time you meet a definition, it&#8217;s very much worth being aware of what kind it is.<br />
<span id="more-3568"></span></p>
<p>Before I start, I should make clear that there is some overlap between some of the categories of definition below.</p>
<p><strong>1. Mere abbreviations.</strong> </p>
<p>Some definitions are little more than convenient abbreviations. For instance, it is annoying to keep writing <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%3Af%28x%29%5Cin+A%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x:f(x)&#92;in A&#92;}' title='&#92;{x:f(x)&#92;in A&#92;}' class='latex' /> so instead we write <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(A)' title='f^{-1}(A)' class='latex' />. And having decided that the ratio of the circumference of a circle to its diameter is important (<a href="http://www.math.utah.edu/%7Epalais/pi.pdf">or not, as the case may be</a>), we prefer to write <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> rather than something like &#8220;half the circumference of a unit circle&#8221;.</p>
<p>A slightly different example (because what is being defined is an adjective) is saying &#8220;even&#8221; instead of &#8220;divisible by 2&#8221; and saying &#8220;odd&#8221; instead of &#8220;not divisible by 2&#8221;. </p>
<p>I don&#8217;t find it all that easy to think of examples of &#8220;mere&#8221; abbreviations, because almost all definitions do something more. &#8220;Equivalence relation&#8221; perhaps counts, since &#8220;<img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> is an equivalence relation&#8221; is short(ish) for &#8220;<img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> is reflexive, symmetric and transitive&#8221;. </p>
<p>However, that example shows that even mere abbreviations aren&#8217;t completely &#8220;mere&#8221;, since the fact that one bothers to come up with the abbreviation is a signal that the concept is worth naming. (This is similar to real-life examples such as UK, CIA, DPMMS, &#8220;bike&#8221;, &#8220;mobile&#8221;, etc.)</p>
<p><strong>2. Definitions that replace entire sentences by single words or short phrases.</strong></p>
<p>A positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> not equal to 1 is <em>prime</em> if its only factors are 1 and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. Why do I think of this as a slightly different kind of definition from the &#8220;mere&#8221; abbreviations? </p>
<p>Consider the following two sentences.</p>
<ul>
<li>Every even number apart from 2 is a sum of two primes.</li>
<li>Every number divisible by 2 apart from 2 is a sum of two primes.</li>
</ul>
<p>Although I had to put &#8220;divisible by 2&#8221; after &#8220;number&#8221;, basically all I did was replace &#8220;even&#8221; by &#8220;divisible by 2&#8221;. That is the sense in which I am calling &#8220;even&#8221; a &#8220;mere&#8221; abbreviation of &#8220;divisible by 2&#8221;. </p>
<p>What if I also wanted to do without the word &#8220;primes&#8221;? I would have to write something much more convoluted like this.</p>
<ul>
<li>Every number divisible by 2 apart from 2 is a sum of two positive integers, each of which is not equal to 1 and has no factors apart from 1 and itself.</li>
</ul>
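<p>As a small sanity check, the same unfolding can be sketched in Python. The function names here are invented for this illustration, and the check only looks at even numbers up to 100, so it says nothing about whether the statement is true in general.</p>

```python
def factors(n):
    """All positive factors of the positive integer n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def is_prime(n):
    """n is not equal to 1 and its only factors are 1 and n."""
    return n != 1 and factors(n) == [1, n]


def sum_of_two_primes(n):
    """The statement written using the word 'prime'."""
    return any(is_prime(a) and is_prime(n - a) for a in range(1, n))


def sum_of_two_primes_unfolded(n):
    """The same statement with 'prime' replaced by what it means."""
    return any(
        a != 1
        and factors(a) == [1, a]
        and (n - a) != 1
        and factors(n - a) == [1, n - a]
        for a in range(1, n)
    )


# Check the small cases only: even numbers from 4 up to 100.
small_cases_hold = all(sum_of_two_primes(n) for n in range(4, 101, 2))
```

<p>The two functions agree on every input. The second is simply harder to read, which is the reason for having the word &#8220;prime&#8221; in the first place.</p>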
<p>Here are a few more definitions of a similar kind.</p>
<ul>
<li>A subset <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> is <em>bounded</em> if there exists <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%7Ca%7C%5Cleq+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a|&#92;leq C' title='|a|&#92;leq C' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' />.</li>
<li>A sequence <img src='http://s0.wp.com/latex.php?latex=%28a_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a_n)' title='(a_n)' class='latex' /> of real numbers is <em>convergent</em> if there exists <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N}' title='N&#92;in&#92;mathbb{N}' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon' title='|a_n-a|&lt;&#92;epsilon' class='latex' />.</li>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is an <em>injection</em> if <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(y)' title='f(x)=f(y)' class='latex' />.</li>
<li>A subgroup <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> of a group <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> is <em>normal</em> if <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}&#92;in H' title='ghg^{-1}&#92;in H' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />.</li>
</ul>
<p>If I want to replace the word being defined by what it means, I don&#8217;t just stick some slightly longer phrase where the word was: I end up rewriting the whole sentence. For instance, if I want to explain what it means to say that every right coset of a normal subgroup is equal to a left coset, I have to say something like, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> be a subgroup of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=ghg%5E%7B-1%7D%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ghg^{-1}&#92;in H' title='ghg^{-1}&#92;in H' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=g%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;in G' title='g&#92;in G' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h%5Cin+H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h&#92;in H' title='h&#92;in H' class='latex' />. Then every right coset of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> is equal to a left coset of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' />.&#8221;</p>
<p><strong>3. Strange new definitions of concepts you thought were already defined.</strong></p>
<p>A few concepts that you may have seen &#8220;defined&#8221; in this sense are positive integers, integers, rational numbers, real numbers, complex numbers (not to mention how to add and multiply all these kinds of numbers), ordered pairs, functions, relations, binary operations, and sequences. </p>
<p>Let me list how those are defined. (Apologies in advance if I get any of them wrong.)</p>
<ul>
<li>A <em>positive integer</em> is a non-empty finite set <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> with the following two properties: its elements are totally ordered by the relation <img src='http://s0.wp.com/latex.php?latex=%5Cin&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;in' title='&#92;in' class='latex' />, and every element of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is also a subset of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</li>
<li>Define an equivalence relation <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> on ordered pairs of positive integers by <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29%5Csim%28c%2Cd%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)&#92;sim(c,d)' title='(a,b)&#92;sim(c,d)' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=a%2Bd%3Db%2Bc&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+d=b+c' title='a+d=b+c' class='latex' />. An <em>integer</em> is an equivalence class of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />.</li>
<li>Define an equivalence relation <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> on ordered pairs <img src='http://s0.wp.com/latex.php?latex=%28p%2Cq%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p,q)' title='(p,q)' class='latex' /> of integers with <img src='http://s0.wp.com/latex.php?latex=q%5Cne+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q&#92;ne 0' title='q&#92;ne 0' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%28p%2Cq%29%5Csim%28r%2Cs%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p,q)&#92;sim(r,s)' title='(p,q)&#92;sim(r,s)' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=ps%3Dqr&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ps=qr' title='ps=qr' class='latex' />. A <em>rational number</em> is an equivalence class of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />.</li>
<li>Define an equivalence relation <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> on Cauchy sequences of rationals by saying that <img src='http://s0.wp.com/latex.php?latex=%28a_n%29%5Csim%28b_m%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a_n)&#92;sim(b_m)' title='(a_n)&#92;sim(b_m)' class='latex' /> if the sequence <img src='http://s0.wp.com/latex.php?latex=a_1%2Cb_1%2Ca_2%2Cb_2%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_1,b_1,a_2,b_2,&#92;dots' title='a_1,b_1,a_2,b_2,&#92;dots' class='latex' /> is Cauchy. A <em>real number</em> is an equivalence class of <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' />.</li>
<li>A <em>complex number</em> is an ordered pair of real numbers.</li>
</ul>
<p>The definitions so far are of little value until one defines algebraic operations. I won&#8217;t give all those, but here&#8217;s one.</p>
<ul>
<li>Given two complex numbers <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)' title='(a,b)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28c%2Cd%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(c,d)' title='(c,d)' class='latex' /> we define their <em>product</em> <img src='http://s0.wp.com/latex.php?latex=%28a%2Cb%29%28c%2Cd%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a,b)(c,d)' title='(a,b)(c,d)' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=%28ac-bd%2Cad%2Bbc%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(ac-bd,ad+bc)' title='(ac-bd,ad+bc)' class='latex' />.</li>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be two sets. The <em>ordered pair</em> <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> is the set <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%2C%5C%7Bx%2Cy%5C%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x,&#92;{x,y&#92;}&#92;}' title='&#92;{x,&#92;{x,y&#92;}&#92;}' class='latex' />.</li>
<li>A <em>function</em> from a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to a set <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is a subset <img src='http://s0.wp.com/latex.php?latex=F%5Csubset+A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F&#92;subset A&#92;times B' title='F&#92;subset A&#92;times B' class='latex' /> with the property that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+F&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in F' title='(x,y)&#92;in F' class='latex' />.</li>
<li>A <em>relation</em> on a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' />.</li>
<li>A <em>binary operation</em> on a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is a function from <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</li>
<li>A <em>sequence</em> of real numbers is a function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />.</li>
</ul>
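<p>A few of the constructions listed above are easy to play with in Python. This is only a sketch: the function names are made up for the illustration, ordinary Python tuples stand in for ordered pairs, and ordinary Python integers stand in for the positive integers, integers and real numbers. The pair (a, b) in the first function is being read as &#8220;a minus b&#8221; and the pair (p, q) in the second as &#8220;p over q&#8221;.</p>

```python
def same_integer(ab, cd):
    """(a, b) ~ (c, d) if a + d == b + c, for pairs of positive integers."""
    (a, b), (c, d) = ab, cd
    return a + d == b + c


def same_rational(pq, rs):
    """(p, q) ~ (r, s) if p * s == q * r, for pairs of integers with q, s nonzero."""
    (p, q), (r, s) = pq, rs
    return p * s == q * r


def complex_product(ab, cd):
    """Product of two complex numbers given as ordered pairs of reals."""
    (a, b), (c, d) = ab, cd
    return (a * c - b * d, a * d + b * c)


# (1, 3) and (5, 7) represent the same integer (informally, -2).
both_minus_two = same_integer((1, 3), (5, 7))

# (1, 2) and (3, 6) represent the same rational number (informally, a half).
both_one_half = same_rational((1, 2), (3, 6))

# The pair (0, 1) plays the role of i, and its square is (-1, 0).
i_squared = complex_product((0, 1), (0, 1))
```

<p>Nothing here mentions subtraction, division or a square root of minus one. Those ideas guided the choice of definitions, but the definitions themselves use only pairs, addition and multiplication.</p>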
<p>It is customary to present definitions of this kind as though they were getting to the essence of the concept being defined. Based on examples like &#8220;is less than&#8221; or &#8220;is a factor of&#8221; or &#8220;is congruent mod 7 to&#8221;, you might have thought that a relation on a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> was a potential relationship between pairs of elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> that holds for some pairs and not for others, but actually, you are told, it&#8217;s just a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' />. </p>
<p>When you are presented with one of these &#8220;but actually&#8221; definitions, you should go through the following process.</p>
<p>(i) Understand how your intuitive understanding of the concept being defined relates to the formal definition you are presented with.</p>
<p>(ii) Continue to use the intuitive understanding, turning to the formal definition if you are ever in danger of getting muddled, or if you want to make general statements about the concept in question.</p>
<p>(iii) Think about what properties of the intuitive concept the formal definition is trying to capture.</p>
<p>Just to clarify what I mean by (ii), suppose you were asked a simple question like, &#8220;How many relations are there on a set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> of size <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />?&#8221; If you think of a relation as a way of relating elements of the set, then this could seem a difficult question. If you think of it as a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' />, then you will see instantly (I hope) that the answer is <img src='http://s0.wp.com/latex.php?latex=2%5E%7Bn%5E2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^{n^2}' title='2^{n^2}' class='latex' />. </p>
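<p>For very small sets the count can be confirmed by brute force. The following Python sketch (the name <code>all_relations</code> is invented here) simply lists every subset of the set of ordered pairs and counts them. It is hopeless for large sets, since the number of relations grows so fast.</p>

```python
from itertools import combinations, product


def all_relations(A):
    """Every relation on A, where a relation is a subset of A x A."""
    pairs = list(product(A, repeat=2))  # the n^2 elements of A x A
    return [
        frozenset(chosen)
        for size in range(len(pairs) + 1)
        for chosen in combinations(pairs, size)
    ]


# Number of relations on sets of size 0, 1, 2 and 3.
counts = [len(all_relations(list(range(n)))) for n in range(4)]
expected = [2 ** (n * n) for n in range(4)]
```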
<p>Let&#8217;s try to do (i), again using relations as our example. How does the subset-of-<img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' /> definition correspond to the relating-things definition? Well, if I have some way of relating things, it will be expressed as a sentence with gaps into which you insert two elements <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, both of which can range over all of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. For example, if <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is the set of all positive integers, we might have <img src='http://s0.wp.com/latex.php?latex=%2A%5Cleq+2%2A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='*&#92;leq 2*' title='*&#92;leq 2*' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=%2A%5Cnot+%7C+%2A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='*&#92;not | *' title='*&#92;not | *' class='latex' /> as a relation on <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. 
Once I have two elements <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> I can use the relation to form sentences such as <img src='http://s0.wp.com/latex.php?latex=x%5Cleq+2y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;leq 2y' title='x&#92;leq 2y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%5Cnot+%7C+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;not | y' title='x&#92;not | y' class='latex' />. </p>
<p>To convert something like that into a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' /> I just take the set of all ordered pairs <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is related to <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> in the way stated.</p>
<p>In the other direction, if I&#8217;m given a subset <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' />, I can define a relating-things relation <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' /> by setting <img src='http://s0.wp.com/latex.php?latex=xRy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='xRy' title='xRy' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in X' title='(x,y)&#92;in X' class='latex' />. So every method of relating pairs of elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> gives rise to a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times A' title='A&#92;times A' class='latex' /> and vice versa.</p>
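<p>The two directions of this correspondence can be written out concretely for a small finite set. In the Python sketch that follows, all names are hypothetical, a &#8220;way of relating things&#8221; is modelled as a function of two arguments that returns true or false, and the set is taken to be the integers from 1 to 6 rather than all positive integers.</p>

```python
def relation_to_subset(A, related):
    """The set of ordered pairs (x, y) in A x A such that x is related to y."""
    return {(x, y) for x in A for y in A if related(x, y)}


def subset_to_relation(X):
    """The relation R with xRy if and only if (x, y) belongs to X."""
    return lambda x, y: (x, y) in X


A = range(1, 7)


def at_most_twice(x, y):
    """The relation x <= 2y."""
    return x <= 2 * y


def does_not_divide(x, y):
    """The relation 'x does not divide y'."""
    return y % x != 0


X = relation_to_subset(A, at_most_twice)
R = subset_to_relation(X)

# Going from the relation to the subset and back changes nothing.
round_trip_ok = all(R(x, y) == at_most_twice(x, y) for x in A for y in A)
```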
<p>The exercise I have just carried out for relations tends not to be hard to carry out for other concepts. To give one other example, a real sequence <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' /> gives us the function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BN%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{N}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{N}&#92;to&#92;mathbb{R}' class='latex' /> defined by <img src='http://s0.wp.com/latex.php?latex=f%28n%29%3Da_n%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(n)=a_n,' title='f(n)=a_n,' class='latex' /> and a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BN%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{N}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{N}&#92;to&#92;mathbb{R}' class='latex' /> gives us the real sequence <img src='http://s0.wp.com/latex.php?latex=%28a_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(a_n)' title='(a_n)' class='latex' /> defined by <img src='http://s0.wp.com/latex.php?latex=a_n%3Df%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n=f(n)' title='a_n=f(n)' class='latex' />.</p>
<p>What do I mean by (iii)? This is easier to say for some examples than it is for others. A particularly easy case is ordered pairs. What is the point of defining the ordered pair <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> as the set <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%2C%5C%7Bx%2Cy%5C%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x,&#92;{x,y&#92;}&#92;}' title='&#92;{x,&#92;{x,y&#92;}&#92;}' class='latex' />? It&#8217;s that, officially at least, mathematicians don&#8217;t like having too many <em>primitive</em> concepts &#8212; that is, concepts that can&#8217;t be defined in terms of lower-level concepts &#8212; so they try to build everything up from sets. </p>
<p>So far so good, but what makes us choose <em>that</em> funny set to count as &#8220;the ordered pair&#8221; <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' />? Well, what is the main thing we care about when we deal with ordered pairs? It&#8217;s that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%3D%28z%2Cw%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)=(z,w)' title='(x,y)=(z,w)' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=x%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=z' title='x=z' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%3Dw&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=w' title='y=w' class='latex' />. It&#8217;s a simple exercise to show that the sets <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%2C%5C%7Bx%2Cy%5C%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x,&#92;{x,y&#92;}&#92;}' title='&#92;{x,&#92;{x,y&#92;}&#92;}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5C%7Bz%2C%5C%7Bz%2Cw%5C%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{z,&#92;{z,w&#92;}&#92;}' title='&#92;{z,&#92;{z,w&#92;}&#92;}' class='latex' /> are equal if and only if <img src='http://s0.wp.com/latex.php?latex=x%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=z' title='x=z' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%3Dw&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=w' title='y=w' class='latex' />, so this set-theoretic construction gives us a way of defining a set-theoretic object that has the key property we want of ordered pairs.</p>
<p>One consequence of the fact that it&#8217;s really the <em>properties</em> we are interested in rather than the objects themselves, is that we can &#8220;define&#8221; the same concept in more than one way. For example, I could define the ordered pair <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> to be the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B%5C%7Bx%5C%7D%2C%5C%7Bx%2Cy%5C%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{&#92;{x&#92;},&#92;{x,y&#92;}&#92;}' title='&#92;{&#92;{x&#92;},&#92;{x,y&#92;}&#92;}' class='latex' /> instead. That would again have the required property.</p>
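<p>Both constructions can be tried out in Python using <code>frozenset</code>, which is a set that is itself allowed to be an element of another set. The sketch uses the integers 0, 1 and 2 as stand-ins for the objects being paired and checks the key property by brute force. That is an experiment on a tiny example rather than a proof, and the function names are invented for the purpose.</p>

```python
from itertools import product


def pair(x, y):
    """The set {x, {x, y}}."""
    return frozenset({x, frozenset({x, y})})


def pair_alt(x, y):
    """The alternative set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


atoms = [0, 1, 2]


def has_key_property(make_pair):
    """Check that (x, y) = (z, w) if and only if x = z and y = w, over the atoms."""
    return all(
        (make_pair(x, y) == make_pair(z, w)) == (x == z and y == w)
        for x, y, z, w in product(atoms, repeat=4)
    )


# A plain two-element set forgets the order, which is why it will not do.
plain_set_forgets_order = frozenset({1, 2}) == frozenset({2, 1})
```

<p>Both <code>pair</code> and <code>pair_alt</code> pass the check, even though as sets they are different objects. Only the property matters.</p>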
<p>Real numbers can be &#8220;defined&#8221; in several ways. I mentioned the Cauchy-sequences definition above, but another well-known one is the notion of a <em>Dedekind cut</em>. We define a real number to be a partition of the rational numbers into two non-empty sets <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> such that every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is less than every element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. </p>
<p>How does this correspond to what you might think of as a real number <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' />? Well, given your number <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' />, you can define <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to be the set of all rationals less than <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> to be the set of all rationals greater than or equal to <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' />. In the other direction, given two sets <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> with that property, you can calculate the decimal expansion of a real number by using the following procedure. Start with the biggest integer that belongs to <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. Let&#8217;s say it is 3. Now take the biggest multiple of 0.1 that belongs to <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. Let&#8217;s say that is <img src='http://s0.wp.com/latex.php?latex=3.1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3.1' title='3.1' class='latex' />. Then take the biggest multiple of <img src='http://s0.wp.com/latex.php?latex=0.01&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0.01' title='0.01' class='latex' />. 
Let&#8217;s say that is <img src='http://s0.wp.com/latex.php?latex=3.14&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3.14' title='3.14' class='latex' />. Continuing in this way, we build up the decimal expansion of a number <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> that is at least as big as every number in <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. </p>
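<p>The digit-by-digit procedure just described is mechanical enough to run. In this Python sketch, which is an illustration under assumptions and not a standard library routine, a cut is represented by a function that says whether a given rational belongs to the lower set. The example cut chosen here is the one that ought to correspond to the square root of 2, and exact rational arithmetic is done with <code>fractions.Fraction</code>.</p>

```python
from fractions import Fraction


def in_A(q):
    """Lower set of an example cut: the rationals q with q < 0 or q * q < 2."""
    return q < 0 or q * q < 2


def decimal_expansion(in_lower_set, digits):
    """Follow the procedure in the text for a given number of decimal places."""
    # Start with the biggest integer that belongs to the lower set.
    k = 0
    while in_lower_set(Fraction(k + 1)):
        k += 1
    while not in_lower_set(Fraction(k)):
        k -= 1
    approx = Fraction(k)
    step = Fraction(1)
    for _ in range(digits):
        step /= 10
        # Biggest multiple of the current step that still belongs to the lower set.
        d = max(d for d in range(10) if in_lower_set(approx + d * step))
        approx += d * step
    return approx


four_places = decimal_expansion(in_A, 4)  # an exact Fraction with denominator dividing 10**4
```

<p>If the cut comes from an integer such as 3, with the lower set consisting of the rationals less than 3, the procedure produces 2.999&#8230; instead of 3.000&#8230;, because 3 itself is not in the lower set.</p>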
<p>What properties do we want real numbers to have? The answer is that we want them to have the kinds of arithmetic properties we expect &#8212; things like <img src='http://s0.wp.com/latex.php?latex=x%28y%2Bz%29%3Dxy%2Bxz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x(y+z)=xy+xz' title='x(y+z)=xy+xz' class='latex' /> &#8212; and to have the property that every increasing sequence that is bounded above converges to a limit. If you don&#8217;t know what that means, it doesn&#8217;t matter too much here. What matters here is that it is an axiom on which is built the theory of real analysis, which you will be doing next term. There are certain properties that turn out to imply all the other statements we want to make about real numbers, and Dedekind cuts are a way of showing that if we&#8217;ve got the rationals and we&#8217;ve got some set theory, then we don&#8217;t have to introduce any <em>new</em> objects. (The rationals themselves are defined in terms of the integers, which are defined in terms of the natural numbers, which are defined in terms of sets.)</p>
<p><strong>4. Calculation definitions.</strong></p>
<p>I commented above that one can define ordered pairs or real numbers, or many other mathematical concepts, in several different ways. By that I meant <em>really</em> different: an equivalence class of Cauchy sequences of rational numbers is not the same thing as a Dedekind cut, but either will serve as a construction-definition of a real number.</p>
<p>There is another kind of non-uniqueness that frequently occurs when we want to define a number or function. For example, <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> can be defined as <img src='http://s0.wp.com/latex.php?latex=4%281-1%2F3%2B1%2F5-1%2F7%2B%5Cdots%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4(1-1/3+1/5-1/7+&#92;dots)' title='4(1-1/3+1/5-1/7+&#92;dots)' class='latex' />, or it can be defined as the area of a unit circle (which itself can be defined using an integral). It is far from obvious that these two definitions result in the same number, but a bit of theory shows that they do.</p>
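<p>One can at least see numerically that the two definitions appear to agree. The Python sketch here adds up finitely many terms of the series and, separately, approximates the area of a unit circle by slicing a quarter of it into thin strips. The function names and the choice of the midpoint rule are assumptions made for the illustration, and agreement to a few decimal places is of course no substitute for the bit of theory.</p>

```python
import math


def leibniz(terms):
    """Partial sum of 4(1 - 1/3 + 1/5 - 1/7 + ...)."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))


def unit_circle_area(strips):
    """Midpoint-rule approximation to 4 times the integral of sqrt(1 - x^2) over [0, 1]."""
    h = 1 / strips
    return 4 * sum(
        math.sqrt(1 - ((i + 0.5) * h) ** 2) * h for i in range(strips)
    )


series_value = leibniz(10000)
area_value = unit_circle_area(10000)
gap = abs(series_value - area_value)
```

<p>The series converges slowly: with <code>terms</code> terms the error is roughly one over <code>terms</code>, so many terms are needed for even a few correct decimal places.</p>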
<p>An example of a function that can be defined in more than one way is <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />. Here are four ways of defining it.</p>
<p>(1) Let <img src='http://s0.wp.com/latex.php?latex=e&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e' title='e' class='latex' /> be the number <img src='http://s0.wp.com/latex.php?latex=%5Cfrac+1%7B0%21%7D%2B%5Cfrac+1%7B1%21%7D%2B%5Cfrac+1%7B2%21%7D%2B%5Cfrac+1%7B3%21%7D%2B%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;frac 1{0!}+&#92;frac 1{1!}+&#92;frac 1{2!}+&#92;frac 1{3!}+&#92;dots' title='&#92;frac 1{0!}+&#92;frac 1{1!}+&#92;frac 1{2!}+&#92;frac 1{3!}+&#92;dots' class='latex' /></p>
<p>For every positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> define <img src='http://s0.wp.com/latex.php?latex=e%5En&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^n' title='e^n' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=e.e%5E%7Bn-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e.e^{n-1}' title='e.e^{n-1}' class='latex' />, with <img src='http://s0.wp.com/latex.php?latex=e%5E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^0' title='e^0' class='latex' /> defined to be 1.</p>
<p>For every pair of positive integers <img src='http://s0.wp.com/latex.php?latex=p%2Cq&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p,q' title='p,q' class='latex' /> define <img src='http://s0.wp.com/latex.php?latex=e%5E%7Bp%2Fq%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^{p/q}' title='e^{p/q}' class='latex' /> to be the <img src='http://s0.wp.com/latex.php?latex=q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q' title='q' class='latex' />th root of <img src='http://s0.wp.com/latex.php?latex=e%5Ep&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^p' title='e^p' class='latex' />.</p>
<p>Finally, given a real number <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, let <img src='http://s0.wp.com/latex.php?latex=%28p_n%2Fq_n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(p_n/q_n)' title='(p_n/q_n)' class='latex' /> be a sequence of rational numbers converging to <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and define <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> to be the limit of the numbers <img src='http://s0.wp.com/latex.php?latex=e%5E%7Bp_n%2Fq_n%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^{p_n/q_n}' title='e^{p_n/q_n}' class='latex' />.</p>
<p>(2) Define <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=1%2Bx%2B%5Cfrac%7Bx%5E2%7D%7B2%21%7D%2B%5Cfrac%7Bx%5E3%7D%7B3%21%7D%2B%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1+x+&#92;frac{x^2}{2!}+&#92;frac{x^3}{3!}+&#92;dots' title='1+x+&#92;frac{x^2}{2!}+&#92;frac{x^3}{3!}+&#92;dots' class='latex' /></p>
<p>(3) Define <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> to be the unique solution of the differential equation <img src='http://s0.wp.com/latex.php?latex=f%27%28x%29%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f&#039;(x)=f(x)' title='f&#039;(x)=f(x)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%280%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(0)=1' title='f(0)=1' class='latex' />. </p>
<p>(4) Define <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> to be the limit as <img src='http://s0.wp.com/latex.php?latex=n%5Cto%5Cinfty&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;to&#92;infty' title='n&#92;to&#92;infty' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=%281%2Bx%2Fn%29%5En&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1+x/n)^n' title='(1+x/n)^n' class='latex' />.</p>
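<p>To see the four definitions side by side, here is a small Python sketch that evaluates each of them approximately at one positive point. All the function names are hypothetical, the truncation parameters are arbitrary choices, and in Definition (1) the root is taken with Python&#8217;s built-in power operator, which is a shortcut and not part of the definition. For Definition (3) I have assumed a standard Runge-Kutta scheme as the way of solving the differential equation numerically.</p>

```python
import math
from fractions import Fraction

def e_constant(terms=30):
    """e as the partial sum 1/0! + 1/1! + 1/2! + ..."""
    return sum(1 / math.factorial(k) for k in range(terms))

def exp_via_roots(x, max_denominator=1000):
    """Definition (1), for positive x: approximate x by p/q, take the qth root of e^p."""
    ratio = Fraction(x).limit_denominator(max_denominator)
    p, q = ratio.numerator, ratio.denominator
    return (e_constant() ** p) ** (1 / q)

def exp_via_series(x, terms=40):
    """Definition (2): partial sum of 1 + x + x^2/2! + x^3/3! + ..."""
    return sum(x ** k / math.factorial(k) for k in range(terms))

def exp_via_ode(x, steps=1000):
    """Definition (3): solve f' = f with f(0) = 1 by classical Runge-Kutta steps."""
    h = x / steps
    value = 1.0
    for _ in range(steps):
        # For the equation f' = f, the slope at any point is just the current value.
        k1 = value
        k2 = value + h * k1 / 2
        k3 = value + h * k2 / 2
        k4 = value + h * k3
        value += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return value

def exp_via_limit(x, n=10**6):
    """Definition (4): (1 + x/n)^n for a single large n."""
    return (1 + x / n) ** n

definitions = (exp_via_roots, exp_via_series, exp_via_ode, exp_via_limit)
values = [define(1.5) for define in definitions]
print(values)
```

<p>The four numbers agree to several decimal places, with Definition (4) converging noticeably more slowly than the others.</p>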
<p>Now it might seem that <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> is a pre-existing function, and that these definitions are just ways of calculating <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />. But that isn&#8217;t really the point of these definitions. The point is that it is incredibly useful to us to have a continuous function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> with the property that <img src='http://s0.wp.com/latex.php?latex=f%28x%2By%29%3Df%28x%29f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x+y)=f(x)f(y)' title='f(x+y)=f(x)f(y)' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. It is an exercise to prove that if such a function exists, then it is determined by a single parameter (such as its value at 1, or the value of its derivative at 0). The above definitions are four routes to proving that it exists. You will probably be given Definition (2) as the &#8220;official&#8221; definition of <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> (though I myself prefer to use Definition (4)). 
Whichever definition you choose, one of the first things you do is prove that <img src='http://s0.wp.com/latex.php?latex=f%28x%2By%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x+y)' title='f(x+y)' class='latex' /> is always equal to <img src='http://s0.wp.com/latex.php?latex=f%28x%29f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)f(y)' title='f(x)f(y)' class='latex' />. That pins your function down to one of the form <img src='http://s0.wp.com/latex.php?latex=a%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a^x' title='a^x' class='latex' /> for some real number <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />. To get the right function, you need to impose one more condition, which can be done in many ways: perhaps the simplest is to insist that <img src='http://s0.wp.com/latex.php?latex=f%27%280%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f&#039;(0)=1' title='f&#039;(0)=1' class='latex' />.</p>
<p>In general, when you are presented with a calculation-definition, for example of some new function, I strongly recommend that you pay close attention to the basic properties that your lecturer goes on to prove. Very often these determine the function uniquely and are what you use in practice when you are proving further things about the function. </p>
<p>With the exponential function, it is profitable to think of <img src='http://s0.wp.com/latex.php?latex=f%28x%2By%29%3Df%28x%29f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x+y)=f(x)f(y)' title='f(x+y)=f(x)f(y)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%27%280%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f&#039;(0)=1' title='f&#039;(0)=1' class='latex' /> as &#8220;axioms for <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />&#8221;. As an indication of how that is a useful point of view, let&#8217;s imagine that we have been given the power-series definition of <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> and are now faced with the task of proving that the derivative of <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' />. We can of course differentiate term by term, but then we need to know that that&#8217;s allowed. (It is, but it takes a little bit of work to prove it.) Another way of proceeding is first to show that <img src='http://s0.wp.com/latex.php?latex=e%5E%7Bx%2By%7D%3De%5Exe%5Ey&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^{x+y}=e^xe^y' title='e^{x+y}=e^xe^y' class='latex' /> and that the derivative at 0 is 1. Then the derivative at <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, if it exists, is <img src='http://s0.wp.com/latex.php?latex=%5Clim_%7Bh%5Cto+0%7D%28e%5E%7Bx%2Bh%7D-e%5Ex%29%2Fh&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lim_{h&#92;to 0}(e^{x+h}-e^x)/h' title='&#92;lim_{h&#92;to 0}(e^{x+h}-e^x)/h' class='latex' />. 
This fraction equals <img src='http://s0.wp.com/latex.php?latex=e%5Ex%28e%5Eh-1%29%2Fh&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x(e^h-1)/h' title='e^x(e^h-1)/h' class='latex' />, so we see straight away that the derivative will be <img src='http://s0.wp.com/latex.php?latex=e%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^x' title='e^x' class='latex' /> times the derivative at 0. </p>
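<p>The factorization in that argument can be watched numerically. The following Python sketch (with hypothetical names <code>exp_series</code> and <code>difference_quotient</code>, and a truncated power series standing in for the full one) compares the difference quotient at a point with the value of the function there times the difference quotient at 0.</p>

```python
import math

def exp_series(x, terms=40):
    """Truncated power series 1 + x + x^2/2! + ... used as a stand-in for e^x."""
    return sum(x ** k / math.factorial(k) for k in range(terms))

def difference_quotient(f, x, h):
    """The fraction (f(x + h) - f(x)) / h whose limit is the derivative at x."""
    return (f(x + h) - f(x)) / h

x = 0.7
for h in (1e-2, 1e-4, 1e-6):
    at_x = difference_quotient(exp_series, x, h)
    at_zero = difference_quotient(exp_series, 0.0, h)
    # The second and third printed numbers should agree up to rounding error.
    print(h, at_x, exp_series(x) * at_zero)
```

<p>For each value of the step the two quantities agree up to rounding, and as the step shrinks the quotient at 0 approaches 1, so the quotient at the chosen point approaches the value of the function there.</p>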
<p>That second proof probably ends up involving a similar amount of work to the first, but it has a big advantage, which is that it shows that the properties &#8220;differentiates to itself&#8221; and &#8220;turns addition into multiplication&#8221; are closely related. They aren&#8217;t just two properties that a function given by a funny formula happens to have.</p>
<p>Calculation-definitions are different from the construction-definitions discussed in the previous section, since what is calculated is the same thing for each definition. For instance, although definitions (1)-(4) above are different, they all define the same object &#8212; a certain function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />. By contrast, as already mentioned, different ways of defining ordered pairs or real numbers give you distinct mathematical objects (that nevertheless have the same important properties).</p>
<p>As with construction-definitions, calculation-definitions are sometimes helpful for more than just the basic properties of what is being defined. For example, if I want to prove that <img src='http://s0.wp.com/latex.php?latex=e%5E1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e^1' title='e^1' class='latex' /> is irrational, then I will certainly avail myself of the power-series definition. Note that once we know that there is only one continuous function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%2By%29%3Df%28x%29f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x+y)=f(x)f(y)' title='f(x+y)=f(x)f(y)' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%27%280%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f&#039;(0)=1' title='f&#039;(0)=1' class='latex' />, establishing continuity and those properties for a new definition is enough to show that it gives us the same function as the other ones. In other words, we don&#8217;t have to prove that by means of some laborious calculation.</p>
<hr />
<p>I haven&#8217;t posted for a while now, so I&#8217;m going to post this, even though I think that there may be entire classes of definitions that I have not mentioned. However, the main thing I want to say about definitions &#8212; that some definitions don&#8217;t look like definitions at all &#8212; is so important that I am going to devote a separate post (the next one) to it.</p>
<p>To summarize, some definitions are mere abbreviations. The main purpose of some definitions is to pick out certain properties (e.g., saying that a triangle is <em>equilateral</em> if its three sides have the same length). Some definitions are constructions of mathematical objects that may look a little bizarre but are designed to have properties that enable them to model our pre-existing intuitive concepts. And some are ways of specifying numbers or functions that are again of interest mainly for their properties.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/23/definitions/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Permutations</title>
		<link>http://gowers.wordpress.com/2011/10/16/permutations/</link>
		<comments>http://gowers.wordpress.com/2011/10/16/permutations/#comments</comments>
		<pubDate>Sun, 16 Oct 2011 16:32:27 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[IA Groups]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3540</guid>
		<description><![CDATA[I don&#8217;t have too much to say about permutations, but there are two points that I have often found myself needing to get straight in supervisions. In fact, make that three. Here they are. [Added later: I have just finished the post, and it ended up being longer than I expected.] 1. The first is [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3540&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t have too much to say about permutations, but there are two points that I have often found myself needing to get straight in supervisions. In fact, make that three. Here they are. [Added later: I have just finished the post, and it ended up being longer than I expected.]</p>
<p>1. The first is a confusion that some people have about what a permutation of <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> actually <em>is</em>. What could possibly be the trouble, you might ask? Well, let&#039;s take the permutation that in cycle notation is written <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' />. My guess is that a non-negligible percentage of people reading this have worried about whether this permutation means that you cycle round the <em>elements</em> 1, 2 and 4 of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> or the <em>elements in the places</em> 1, 2 and 4.<br />
<span id="more-3540"></span></p>
<p>Before I go any further, let me say precisely what each of these alternatives means. To make sense of the first idea, we imagine placing the numbers in a line. Then if we want to apply the permutation <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' />, we simply replace the number 1 by 2, the number 2 by 4, and the number 4 by 1. If we now want to apply the permutation <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' />, say, we look for the numbers 4 and 5 (wherever they might be) and swap them. Also, if we started with the numbers in their usual order &#8212; that is, running from 1 to <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> &#8212; and want to know what the composition of all the permutations we&#8217;ve done so far is, we simply say that 1 maps to whatever is now in the first place, 2 maps to whatever is in the second place, and so on all the way up to <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>Let me illustrate this with <img src='http://s0.wp.com/latex.php?latex=n%3D6&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=6' title='n=6' class='latex' /> and those two permutations. I start with the numbers arranged as follows: 1 2 3 4 5 6. After doing the permutation <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' /> the numbers are arranged as 2 4 3 1 5 6. Now if I do the permutation <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' /> I have to swap 4 and 5, so I do so and obtain the arrangement 2 5 3 1 4 6. This is the permutation that sends 1 to 2, 2 to 5, 3 to 3, 4 to 1, 5 to 4 and 6 to 6. In cycle notation it is <img src='http://s0.wp.com/latex.php?latex=%281254%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1254)' title='(1254)' class='latex' />. (If you can&#8217;t see that quickly, then read on.)</p>
<p>Now let&#8217;s see what would have happened if instead of switching the <em>numbers</em> indicated, we had switched the <em>numbers in the places</em> indicated. We start with 1 2 3 4 5 6, and &#8220;apply <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' />&#8221; to obtain 4 1 3 2 5 6. What I&#8217;ve done is move 1 to 2&#8217;s old place, 2 to 4&#8217;s old place, and 4 to 1&#8217;s old place. Now we interpret the second permutation <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' /> as switching the numbers in the places 4 and 5, so we obtain this arrangement: 4 1 3 5 2 6. In cycle notation, this is the permutation <img src='http://s0.wp.com/latex.php?latex=%281452%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1452)' title='(1452)' class='latex' />, which is the inverse of the answer in the first interpretation. As we shall see, that is not a coincidence. (I ought to add that I&#8217;ve slightly cheated here: the cycle notation is <img src='http://s0.wp.com/latex.php?latex=%281452%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1452)' title='(1452)' class='latex' /> if we interpret the cycle in the correct way, but if we interpret it in the moving-numbers-in-places way then we would argue that the cycle notation of this permutation is in fact <img src='http://s0.wp.com/latex.php?latex=%281254%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1254)' title='(1254)' class='latex' />. However, it is correct that the two permutations are inverses of each other.)</p>
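<p>Both interpretations are easy to mechanize, which may help if you want to experiment with other examples. Here is a sketch in Python. The helper names <code>cycle_to_map</code>, <code>relabel_numbers</code> and <code>move_places</code> are invented for this illustration, and arrangements are stored as plain lists.</p>

```python
def cycle_to_map(cycle, n):
    """Turn a cycle such as (1, 2, 4) into a dict on the set {1, ..., n}."""
    mapping = {k: k for k in range(1, n + 1)}
    for i, a in enumerate(cycle):
        mapping[a] = cycle[(i + 1) % len(cycle)]
    return mapping

def relabel_numbers(arrangement, cycle):
    """First interpretation: replace each number by its image under the cycle."""
    mapping = cycle_to_map(cycle, len(arrangement))
    return [mapping[a] for a in arrangement]

def move_places(arrangement, cycle):
    """Second interpretation: whatever sits in place k moves to the place cycle(k)."""
    mapping = cycle_to_map(cycle, len(arrangement))
    result = [None] * len(arrangement)
    for place, item in enumerate(arrangement, start=1):
        result[mapping[place] - 1] = item
    return result

start = [1, 2, 3, 4, 5, 6]
by_numbers = relabel_numbers(relabel_numbers(start, (1, 2, 4)), (4, 5))
by_places = move_places(move_places(start, (1, 2, 4)), (4, 5))
print(by_numbers)  # [2, 5, 3, 1, 4, 6]
print(by_places)   # [4, 1, 3, 5, 2, 6]

# Read each arrangement as the map sending k to the entry in place k,
# and compose the two maps.
composed = [by_places[by_numbers[k] - 1] for k in range(6)]
print(composed)    # [1, 2, 3, 4, 5, 6]
```

<p>The last line comes out as the identity arrangement, which is another way of saying that the two resulting permutations are inverses of each other.</p>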
<p>So which is the right interpretation of what a permutation means? It is the first. Here are two reasons why that not only <em>is</em> the case but <em>clearly must have been</em> the case.</p>
<p>First, suppose we decided that we wanted to permute not the numbers from 1 to <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> but the elements of some other set. One simple example might be a different set of numbers, such as <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C3%2C5%2C7%2C9%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,3,5,7,9&#92;}' title='&#92;{1,3,5,7,9&#92;}' class='latex' />. Now we see that the permutation <img src='http://s0.wp.com/latex.php?latex=%28579%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(579)' title='(579)' class='latex' /> makes perfectly good sense if we think of switching numbers around, and no sense at all if we think of switching numbers in places around.</p>
<p>A second reason is that a permutation is defined as a bijection of a set. When we take a set, we don&#8217;t have to arrange its elements in a line. If we did have to, it would be an ordered set that we were talking about. So the switching-numbers interpretation makes sense for the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> even if those numbers are scattered around in space, or put down in a funny order, or just not in any kind of &#8220;space&#8221; at all. </p>
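<p>One way to make this concrete is to store a permutation as a lookup table on an unordered set. In the Python sketch below, the function name <code>cycle_as_bijection</code> is my own invention, and the dictionary is simply one possible representation of a bijection.</p>

```python
def cycle_as_bijection(cycle, elements):
    """Return the bijection of `elements` that cycles `cycle` and fixes the rest."""
    mapping = {a: a for a in elements}
    for i, a in enumerate(cycle):
        mapping[a] = cycle[(i + 1) % len(cycle)]
    return mapping

odd_digits = {1, 3, 5, 7, 9}
perm = cycle_as_bijection((5, 7, 9), odd_digits)
print(sorted(perm.items()))  # [(1, 1), (3, 3), (5, 7), (7, 9), (9, 5)]

# A bijection hits every element exactly once, with no ordering involved.
print(set(perm.values()) == odd_digits)  # True
```

<p>Nothing in this representation refers to a first place or a second place, so the switching-numbers-in-places interpretation has nothing to act on.</p>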
<p>Is that the end of the story? I think it isn&#8217;t enough just to say, &#8220;If you thought permutations were about where the numbers are, then you were wrong: go away and learn the definition properly.&#8221; That isn&#8217;t terrible advice, but somehow it&#8217;s a bit unfriendly, and misses an opportunity to talk about some interesting mathematics. Here are two questions that are worth discussing.</p>
<p>(i) What is it that causes this particular confusion?</p>
<p>(ii) Is there any interesting relationship between the two ways of thinking about permutations? After all, even if the second one isn&#8217;t correct, it still makes sense.</p>
<p>I&#8217;ll tackle (ii) first. To begin with, I&#8217;d like to try to describe the &#8220;moving objects in certain places&#8221; idea in a mathematically precise way. To do that, let us define an <em>ordering</em> of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to be an ordered <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />-tuple that consists of the numbers 1 to <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> in some order. For example, the orderings of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3&#92;}' title='&#92;{1,2,3&#92;}' class='latex' /> are <img src='http://s0.wp.com/latex.php?latex=%281%2C2%2C3%29%2C%281%2C3%2C2%29%2C%282%2C1%2C3%29%2C%282%2C3%2C1%29%2C%283%2C1%2C2%29%2C%283%2C2%2C1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,2,3),(1,3,2),(2,1,3),(2,3,1),(3,1,2),(3,2,1)' title='(1,2,3),(1,3,2),(2,1,3),(2,3,1),(3,1,2),(3,2,1)' class='latex' />. (A little exercise if you feel so inclined: what would be an economical way of describing the order in which I wrote those six orderings, and then generalizing it to orderings of larger sets of integers?)</p>
<p>Now, given a permutation <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> we want to interpret it as something that moves objects that are sitting in certain places. We can do that as follows. I&#8217;ll give an example and then say in general what I&#8217;m doing. </p>
<p>Let&#8217;s take <img src='http://s0.wp.com/latex.php?latex=n%3D4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=4' title='n=4' class='latex' /> and let&#8217;s take the permutation <img src='http://s0.wp.com/latex.php?latex=%28134%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(134)' title='(134)' class='latex' />. I want to think of it as the function that takes an ordering of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3,4&#92;}' title='&#92;{1,2,3,4&#92;}' class='latex' /> and moves whatever is in the first place to the third place, whatever is in the third place to the fourth place, and whatever is in the fourth place to the first place. So for example, if I apply my permutation <img src='http://s0.wp.com/latex.php?latex=%28134%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(134)' title='(134)' class='latex' /> to the ordering <img src='http://s0.wp.com/latex.php?latex=%282%2C1%2C4%2C3%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(2,1,4,3)' title='(2,1,4,3)' class='latex' /> I get the ordering <img src='http://s0.wp.com/latex.php?latex=%283%2C1%2C2%2C4%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(3,1,2,4)' title='(3,1,2,4)' class='latex' />. If I apply it to the ordering <img src='http://s0.wp.com/latex.php?latex=%283%2C2%2C4%2C1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(3,2,4,1)' title='(3,2,4,1)' class='latex' /> I get the ordering <img src='http://s0.wp.com/latex.php?latex=%281%2C2%2C3%2C4%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,2,3,4)' title='(1,2,3,4)' class='latex' />. And so on.</p>
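<p>Here is the same example as a short Python sketch. The function name <code>move_places</code> and the choice of a dictionary to hold the permutation are just conveniences for this illustration.</p>

```python
def move_places(ordering, pi):
    """Move whatever is in place k to place pi[k], where pi is a dict on {1, ..., n}."""
    result = [None] * len(ordering)
    for place, item in enumerate(ordering, start=1):
        result[pi[place] - 1] = item
    return tuple(result)

# The cycle (134) on {1, 2, 3, 4}: it sends 1 to 3, 3 to 4 and 4 to 1, and fixes 2.
pi = {1: 3, 2: 2, 3: 4, 4: 1}

print(move_places((2, 1, 4, 3), pi))  # (3, 1, 2, 4)
print(move_places((3, 2, 4, 1), pi))  # (1, 2, 3, 4)
```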
<p>What&#8217;s going on in general? Well, an ordering of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> takes the form <img src='http://s0.wp.com/latex.php?latex=%28f%281%29%2Cf%282%29%2C%5Cdots%2Cf%28n%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(f(1),f(2),&#92;dots,f(n))' title='(f(1),f(2),&#92;dots,f(n))' class='latex' /> for some bijection <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to itself. But hang on, bijections from <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C%5Cdots%2Cn%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,&#92;dots,n&#92;}' title='&#92;{1,2,&#92;dots,n&#92;}' class='latex' /> to itself are just permutations. So let&#8217;s write <img src='http://s0.wp.com/latex.php?latex=%28%5Csigma%281%29%2C%5Cdots%2C%5Csigma%28n%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(&#92;sigma(1),&#92;dots,&#92;sigma(n))' title='(&#92;sigma(1),&#92;dots,&#92;sigma(n))' class='latex' /> for a typical ordering instead.</p>
<p>Now let&#8217;s take a permutation <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> and use it in the objects-in-certain-places sense. That is, for each <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> I take the object in the <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />th place and move it to the <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28k%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(k)' title='&#92;pi(k)' class='latex' />th place. In other words, I move <img src='http://s0.wp.com/latex.php?latex=%5Csigma%28k%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma(k)' title='&#92;sigma(k)' class='latex' /> (which is currently in the <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />th place) to the place <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28k%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(k)' title='&#92;pi(k)' class='latex' />. </p>
<p>If I do that, then what object ends up in the first place? I need to find <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28k%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(k)=1' title='&#92;pi(k)=1' class='latex' />, and then the object in question is <img src='http://s0.wp.com/latex.php?latex=%5Csigma%28k%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma(k)' title='&#92;sigma(k)' class='latex' /> for that <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />. Well, the <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28k%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(k)=1' title='&#92;pi(k)=1' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5E%7B-1%7D%281%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi^{-1}(1)' title='&#92;pi^{-1}(1)' class='latex' />. It follows that in my notation for orderings, the object in the first place is going to be <img src='http://s0.wp.com/latex.php?latex=%5Csigma%28%5Cpi%5E%7B-1%7D%281%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma(&#92;pi^{-1}(1))' title='&#92;sigma(&#92;pi^{-1}(1))' class='latex' />. That argument obviously works for all numbers, so the eventual ordering is <img src='http://s0.wp.com/latex.php?latex=%28%5Csigma%28%5Cpi%5E%7B-1%7D%281%29%29%2C%5Csigma%28%5Cpi%5E%7B-1%7D%282%29%29%2C%5Cdots%2C%5Csigma%28%5Cpi%5E%7B-1%7D%28n%29%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(&#92;sigma(&#92;pi^{-1}(1)),&#92;sigma(&#92;pi^{-1}(2)),&#92;dots,&#92;sigma(&#92;pi^{-1}(n)))' title='(&#92;sigma(&#92;pi^{-1}(1)),&#92;sigma(&#92;pi^{-1}(2)),&#92;dots,&#92;sigma(&#92;pi^{-1}(n)))' class='latex' />.</p>
<p>In other words, if I want to apply <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> in a moving-places way to an ordering defined by a permutation, I need to multiply that permutation on the right by <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi^{-1}' title='&#92;pi^{-1}' class='latex' /> to get the permutation that defines the new ordering.</p>
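<p>If you don&#8217;t trust the argument, the formula can be checked by brute force for a small value of n. The Python sketch below does this for all pairs of permutations of a four-element set. The function names are hypothetical, and I have assumed the convention that a permutation is stored as the tuple of its images of 1, 2 and so on, which is the same thing as the ordering it defines.</p>

```python
from itertools import permutations

def move_places(ordering, pi):
    """Move the object in place k to place pi(k), where pi is a tuple of images."""
    result = [None] * len(ordering)
    for place, item in enumerate(ordering, start=1):
        result[pi[place - 1] - 1] = item
    return tuple(result)

def inverse(pi):
    """Inverse of a permutation given as the tuple (pi(1), ..., pi(n))."""
    result = [None] * len(pi)
    for k, image in enumerate(pi, start=1):
        result[image - 1] = k
    return tuple(result)

def sigma_after_pi_inverse(sigma, pi):
    """The ordering (sigma(pi^{-1}(1)), ..., sigma(pi^{-1}(n)))."""
    pi_inv = inverse(pi)
    return tuple(sigma[pi_inv[k] - 1] for k in range(len(sigma)))

n = 4
all_perms = list(permutations(range(1, n + 1)))
formula_holds = all(
    move_places(sigma, pi) == sigma_after_pi_inverse(sigma, pi)
    for sigma in all_perms
    for pi in all_perms
)
print(formula_holds)  # True
```

<p>A check like this covers only the one value of n that was tried, of course, whereas the argument above works in general.</p>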
<p>Let&#8217;s check that for the example we had earlier. There we started with the ordering <img src='http://s0.wp.com/latex.php?latex=%281%2C2%2C3%2C4%2C5%2C6%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,2,3,4,5,6)' title='(1,2,3,4,5,6)' class='latex' /> and used the permutation <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' /> to switch certain places around. The result was the ordering <img src='http://s0.wp.com/latex.php?latex=%284%2C1%2C3%2C2%2C5%2C6%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(4,1,3,2,5,6)' title='(4,1,3,2,5,6)' class='latex' />. What have we got in the first place here? We have 4, which is not the image of 1 under the permutation <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' /> but the image of 1 under the <em>inverse</em> of the permutation <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' />, exactly as we wanted.</p>
<p>Next, we used the permutation <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' /> to switch the numbers in places 4 and 5. That gave us the ordering <img src='http://s0.wp.com/latex.php?latex=%284%2C1%2C3%2C5%2C2%2C6%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(4,1,3,5,2,6)' title='(4,1,3,5,2,6)' class='latex' />. What appears in the fifth place here? It should be the inverse of <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' /> applied to the inverse of <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' /> applied to the number 5. OK, take 5 and apply the inverse of <img src='http://s0.wp.com/latex.php?latex=%2845%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(45)' title='(45)' class='latex' />. It gives us 4. Now take 4 and apply the inverse of <img src='http://s0.wp.com/latex.php?latex=%28124%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(124)' title='(124)' class='latex' />. That gives us 2. Do we have 2 in the fifth place? Yes we do.</p>
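<p>If you would like to check this kind of calculation by machine, here is a minimal Python sketch of the moving-places rule. Storing a permutation as a dictionary, and the helper names <code>cycle_to_map</code>, <code>inverse</code> and <code>move_places</code>, are assumptions made purely for illustration. The sketch reproduces the two orderings from the example above.</p>

```python
def cycle_to_map(cycles, n):
    """Build the permutation of {1, ..., n} given by disjoint cycles, as a dict."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for pos, a in enumerate(cycle):
            perm[a] = cycle[(pos + 1) % len(cycle)]
    return perm


def inverse(perm):
    """Return the inverse permutation."""
    return {image: x for x, image in perm.items()}


def move_places(ordering, pi):
    """Move the object in place k to place pi(k), for every k.

    `ordering` is a tuple; its entry in place k (counting from 1) is the
    object currently sitting in that place.
    """
    pi_inv = inverse(pi)
    n = len(ordering)
    # The object that ends up in place j is the one that was in place pi^{-1}(j).
    return tuple(ordering[pi_inv[j] - 1] for j in range(1, n + 1))


start = (1, 2, 3, 4, 5, 6)
after_124 = move_places(start, cycle_to_map([(1, 2, 4)], 6))
after_45 = move_places(after_124, cycle_to_map([(4, 5)], 6))

print(after_124)  # (4, 1, 3, 2, 5, 6)
print(after_45)   # (4, 1, 3, 5, 2, 6)
```

<p>The only line that matters is the last line of <code>move_places</code>: the inverse appears there for exactly the reason given in the argument above.</p>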
<p>The main thing to take out of this is that if you start with the &#8220;identity ordering&#8221; <img src='http://s0.wp.com/latex.php?latex=%281%2C2%2C%5Cdots%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,2,&#92;dots,n)' title='(1,2,&#92;dots,n)' class='latex' /> and use a permutation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> to switch objects in certain places rather than the objects themselves, what you are actually doing is applying the inverse <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma^{-1}' title='&#92;sigma^{-1}' class='latex' /> of the permutation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' />. But you have to be a little careful here. If you then apply a permutation <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> in the moving-places sense, the objects are no longer in their original positions. 
What that means is that the permutation we end up applying after the two operations is not <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5E%7B-1%7D%5Csigma%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi^{-1}&#92;sigma^{-1}' title='&#92;pi^{-1}&#92;sigma^{-1}' class='latex' /> (which means do <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma^{-1}' title='&#92;sigma^{-1}' class='latex' /> first and then <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi^{-1}' title='&#92;pi^{-1}' class='latex' />) but rather <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5E%7B-1%7D%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma^{-1}&#92;pi^{-1}' title='&#92;sigma^{-1}&#92;pi^{-1}' class='latex' />. </p>
<p>However, this isn&#8217;t too surprising. It tells us that if we do the operation associated with <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> and then the operation associated with <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' />, we end up doing the operation associated with <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi&#92;sigma' title='&#92;pi&#92;sigma' class='latex' />. (Later, when you have covered group actions, I will be able to explain all this much more concisely.)</p>
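<p>The order of composition is easy to get wrong, so here is a small Python check, offered as a sketch rather than as part of the course. It takes the final ordering from the earlier example, where the operation associated with (124) was followed by the operation associated with (45), and compares it with the two candidate products. The dictionary representation and the helper names are assumptions for illustration; <code>compose(f, g)</code> means &#8220;do <code>g</code> first and then <code>f</code>&#8221;.</p>

```python
def cycle_to_map(cycles, n):
    """Build the permutation of {1, ..., n} given by disjoint cycles, as a dict."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for pos, a in enumerate(cycle):
            perm[a] = cycle[(pos + 1) % len(cycle)]
    return perm


def inverse(perm):
    """Return the inverse permutation."""
    return {image: x for x, image in perm.items()}


def compose(f, g):
    """Return the permutation 'f after g': apply g first, then f."""
    return {x: f[g[x]] for x in g}


sigma = cycle_to_map([(1, 2, 4)], 6)  # applied first, in the moving-places sense
pi = cycle_to_map([(4, 5)], 6)        # applied second

# The ordering obtained in the text, read as the map place -> object in that place.
final_ordering = (4, 1, 3, 5, 2, 6)
tau = {place: obj for place, obj in enumerate(final_ordering, start=1)}

right_order = compose(inverse(sigma), inverse(pi))  # sigma^{-1} pi^{-1}
wrong_order = compose(inverse(pi), inverse(sigma))  # pi^{-1} sigma^{-1}

assert tau == right_order
assert tau == inverse(compose(pi, sigma))  # the same thing as (pi sigma)^{-1}
assert tau != wrong_order
```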
<p>I don&#8217;t feel confident that I&#8217;ve found the neatest and clearest explanation of the relationship, even if I don&#8217;t allow myself to talk about group actions, but if you are still not clear in your mind how the two ways of getting permutations to do things are related, then I recommend spending some time trying to work it out for yourself. You may also find it helpful to read what I&#8217;m about to say about question (i), which is that there is a very good reason that people find themselves wondering whether the moving-objects-between-places viewpoint is the right one.</p>
<p>One of the things you will be told soon, if you haven&#8217;t been told it already, is that the permutation group <img src='http://s0.wp.com/latex.php?latex=S_3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S_3' title='S_3' class='latex' />, which consists of all permutations of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3&#92;}' title='&#92;{1,2,3&#92;}' class='latex' />, is isomorphic to the symmetry group of an equilateral triangle. Here&#8217;s the rough reason for that. If you take an equilateral triangle and number its three vertices 1, 2 and 3, then any symmetry swaps those three vertices around, and conversely any of the six ways of swapping those three vertices around can be achieved by means of a symmetry. (For example, if we reflect in the line that goes through vertex 1 and the centre of the triangle, then we end up swapping vertices 2 and 3.)</p>
<p>Let&#8217;s try to be more precise about this. The symmetries of an equilateral triangle are three reflections and three rotations (counting the identity map as a rotation through an angle of zero). Which permutation of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3&#92;}' title='&#92;{1,2,3&#92;}' class='latex' /> should correspond to which symmetry? For example, if the triangle has a horizontal base, which permutation corresponds to a reflection through a vertical line that cuts the triangle in half?</p>
<p>To answer this we should think about what that reflection does to the vertices, and to think about that we should probably number the vertices. Perhaps the vertex at the top could be number 1, the bottom left one could be number 2 and the bottom right one could be number 3. So reflecting in a vertical line through the top vertex looks as though it ought to correspond to the permutation <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' />. </p>
<p>How about a rotation through 120 degrees anticlockwise? That takes vertex 1 to vertex 2, vertex 2 to vertex 3 and vertex 3 to vertex 1. Does that mean that the corresponding permutation is <img src='http://s0.wp.com/latex.php?latex=%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(123)' title='(123)' class='latex' />? Let&#8217;s suppose it does. If we do that rotation first, then how do we reflect in a vertical line? Now the vertex 3 is at the top, so does that mean that reflection in a vertical line has suddenly become the permutation <img src='http://s0.wp.com/latex.php?latex=%2812%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)' title='(12)' class='latex' /> instead of <img src='http://s0.wp.com/latex.php?latex=%2823%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(23)' title='(23)' class='latex' />?</p>
<p>We&#8217;re starting to get into a muddle, and the muddle is precisely the one I&#8217;ve been talking about. If we think of the symmetries of a triangle as things like &#8220;reflect in a vertical line through the top vertex&#8221; then what we care about is not what the number of that vertex happens to be, but the <em>place</em> at the top. So we can&#8217;t think of &#8220;reflect in a vertical line through the top vertex&#8221; as mapping vertex number 2 to vertex number 3 and vertex number 3 to vertex number 2. </p>
<p>So is it correct that the symmetry group of a triangle is isomorphic to the permutation group of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%2C3%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2,3&#92;}' title='&#92;{1,2,3&#92;}' class='latex' />? Yes it is, but the isomorphism isn&#8217;t quite what you think. First you have to number the <em>possible positions</em> of the vertices as 1, 2 and 3, and then a permutation such as <img src='http://s0.wp.com/latex.php?latex=%28132%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(132)' title='(132)' class='latex' /> corresponds to the map that sends the vertex in position 1 to the vertex in position 3, the vertex in position 3 to the vertex in position 2, and the vertex in position 2 to the vertex in position 1. What it doesn&#8217;t correspond to is the map that takes the vertex <em>labelled</em> 1 to the vertex <em>labelled</em> 3, etc. </p>
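<p>Here is a small Python sketch of that point. It assumes the positions are numbered 1 for the top, 2 for the bottom left and 3 for the bottom right, and it records the state of the triangle as the tuple of labels currently sitting in positions 1, 2 and 3. The names <code>ROTATE</code>, <code>REFLECT</code> and <code>apply_to_positions</code> are hypothetical, chosen only for this illustration.</p>

```python
def apply_to_positions(labels, perm):
    """Move the label in position k to position perm[k], for k = 1, 2, 3."""
    new_labels = [None] * len(labels)
    for k, label in enumerate(labels, start=1):
        new_labels[perm[k] - 1] = label
    return tuple(new_labels)


# Rotation through 120 degrees anticlockwise: top -> bottom left -> bottom right -> top.
ROTATE = {1: 2, 2: 3, 3: 1}
# Reflection in the vertical line through the top position: swap the two bottom positions.
REFLECT = {1: 1, 2: 3, 3: 2}

start = (1, 2, 3)  # label k starts in position k
rotated = apply_to_positions(start, ROTATE)
reflected = apply_to_positions(rotated, REFLECT)

print(rotated)    # (3, 1, 2): the vertex labelled 3 is now at the top
print(reflected)  # (3, 2, 1): the labels 1 and 2 have changed places
```

<p>The reflection is the same dictionary <code>REFLECT</code> whatever the labels happen to be. It is only when you describe its effect in terms of labels that it appears to change from (23) to (12).</p>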
<p>This is actually the same phenomenon as one you will have come across at school. If you have a graph of a function <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' /> and you want to translate it to the right by <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' />, it&#8217;s natural to think that what you ought to do is take the graph of the function <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%2Bt%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x+t)' title='y=f(x+t)' class='latex' />. However, this is wrong: you need to take the graph of the function <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x-t%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x-t)' title='y=f(x-t)' class='latex' />. Why? Well, if you want the values taken by the function to move to the right, you want the value of the new function at <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to be the value of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> at <img src='http://s0.wp.com/latex.php?latex=x-t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-t' title='x-t' class='latex' />. Similarly, if you want to rotate the labels of the vertices of a triangle through 120 degrees anticlockwise, then you want the new label at a vertex to be the old label at the vertex 120 degrees <em>clockwise</em> from it.</p>
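<p>The translation example can be checked in a couple of lines of Python. This is only a sketch: the function <code>translate_right</code> and the choice of the squaring function with a shift of 3 are assumptions made for the sake of having something concrete.</p>

```python
def translate_right(f, t):
    """Return the function whose graph is the graph of f moved t units to the right."""
    # The new value at x is the old value at x - t.
    return lambda x: f(x - t)


def square(x):
    return x * x


shifted = translate_right(square, 3)
tempting_but_wrong = lambda x: square(x + 3)

# The minimum of x^2 is at x = 0, so after moving right by 3 it should be at x = 3.
print(shifted(3))              # 0
print(tempting_but_wrong(3))   # 36
print(tempting_but_wrong(-3))  # 0: this graph has moved to the left instead
```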
<p>What is the take-home message from all this? The main one is that if you are ever confused by the notion of a permutation, then simply replace the phrase &#8220;permutation of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />&#8221; by &#8220;bijection from <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />&#8221;. If you do that, then you won&#8217;t be tempted to think that the statement <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28n%29%3Dm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(n)=m' title='&#92;pi(n)=m' class='latex' /> means that you are moving the object in place <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> to place <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' />. But a secondary message is that we often do move objects according to <em>where</em> they are rather than <em>what</em> they are, and as long as you are careful you can represent this with permutations as well.</p>
<p>2. The second point I wanted to make was one that applies much more generally than just to permutations. It&#8217;s that there is an important distinction between proofs that tell you that something can be done, and proofs that tell you how to do it. In your lectures, you were told about the cycle notation for permutations. So a fact that you should now be aware of is this.</p>
<ul>
<li>Every permutation of a finite set is a product of disjoint cycles.</li>
</ul>
<p>However, the only proofs of this fact that I can think of tell you more: they explain how to work out what those disjoint cycles are. And while that means that your lecturer is pretty well guaranteed to have told you how to work out the cycle representation of a permutation, I think it&#8217;s worth emphasizing just how easy this algorithm is to apply.</p>
<p>To illustrate this, let me give two examples. The first involves arithmetic mod 15. I&#8217;m not sure whether you&#8217;ve got on to this in Numbers and Sets, but if you haven&#8217;t, it&#8217;s not a problem. All you have to understand to follow what I&#8217;m about to say is that I am going to define a function <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=%5C%7B0%2C1%2C2%2C%5Cdots%2C14%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{0,1,2,&#92;dots,14&#92;}' title='&#92;{0,1,2,&#92;dots,14&#92;}' class='latex' /> to itself by taking <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(x)' title='&#92;pi(x)' class='latex' /> to be the result of multiplying <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> by 7 and taking the remainder on division by 15. For example, to work out <img src='http://s0.wp.com/latex.php?latex=%5Cpi%288%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(8)' title='&#92;pi(8)' class='latex' /> we multiply by 7 to get 56, and since <img src='http://s0.wp.com/latex.php?latex=56%3D3%5Ctimes+15%2B11&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='56=3&#92;times 15+11' title='56=3&#92;times 15+11' class='latex' /> we find that <img src='http://s0.wp.com/latex.php?latex=%5Cpi%288%29%3D11&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(8)=11' title='&#92;pi(8)=11' class='latex' />. </p>
<p>For reasons that will soon be explained to you if they haven&#8217;t already, <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> is a bijection, or we could if we like call it a permutation. I chose it because it is not defined as a product of disjoint cycles, so if we want to show that it <em>is</em> a product of disjoint cycles, then we have to work something out.</p>
<p>Here&#8217;s how to do it. Start with 0. Since <img src='http://s0.wp.com/latex.php?latex=%5Cpi%280%29%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(0)=0' title='&#92;pi(0)=0' class='latex' />, we end up with a cycle of length 1, which in cycle notation one doesn&#8217;t usually bother to write down. So that&#8217;s dealt with one cycle. Now let&#8217;s take the smallest number we haven&#8217;t yet dealt with, which is 1. Multiplying that by 7 gives us 7, which is smaller than 15, so <img src='http://s0.wp.com/latex.php?latex=%5Cpi%281%29%3D7&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(1)=7' title='&#92;pi(1)=7' class='latex' />. That tells us we&#8217;ve got a cycle of length greater than 1, so let&#8217;s start writing it out: <img src='http://s0.wp.com/latex.php?latex=%2817&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(17' title='(17' class='latex' />. Next we work out <img src='http://s0.wp.com/latex.php?latex=%5Cpi%287%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(7)' title='&#92;pi(7)' class='latex' />, which is 4, since <img src='http://s0.wp.com/latex.php?latex=7%5Ctimes+7%3D45%2B4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='7&#92;times 7=45+4' title='7&#92;times 7=45+4' class='latex' />, which means that we can write out more of the cycle, obtaining <img src='http://s0.wp.com/latex.php?latex=%28174&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(174' title='(174' class='latex' />. Continuing, we find that <img src='http://s0.wp.com/latex.php?latex=%5Cpi%284%29%3D13&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(4)=13' title='&#92;pi(4)=13' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cpi%2813%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(13)=1' title='&#92;pi(13)=1' class='latex' />. Once we&#8217;ve got back to 1, we have finished the cycle. 
Since 13 is a two-digit number, let me put in commas for the sake of clarity: we end up writing the cycle <img src='http://s0.wp.com/latex.php?latex=%281%2C7%2C4%2C13%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,7,4,13)' title='(1,7,4,13)' class='latex' />. (This means that 1 goes to 7 goes to 4 goes to 13 goes to 1.)</p>
<p>The smallest number we haven&#8217;t yet dealt with is 2, so off we go. A moment&#8217;s thought tells us that our numbers will be twice the numbers in the previous cycle, except that twice 13 will be 11 after we have taken the remainder on division by 15. So we get the cycle <img src='http://s0.wp.com/latex.php?latex=%282%2C14%2C8%2C11%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(2,14,8,11)' title='(2,14,8,11)' class='latex' />. Next we deal with 3. Multiplying the first cycle by 3 (and taking remainders if necessary) gives us <img src='http://s0.wp.com/latex.php?latex=%283%2C6%2C12%2C9%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(3,6,12,9)' title='(3,6,12,9)' class='latex' />. Now the smallest number we haven&#8217;t dealt with is 5. If we multiply everything in the first cycle by 5, we get <img src='http://s0.wp.com/latex.php?latex=%285%2C5%2C5%2C5%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(5,5,5,5)' title='(5,5,5,5)' class='latex' />. That looks a bit odd, perhaps: it demonstrates the fact that the cancellation law is not valid for multiplication by 5 mod 15. For the purposes of the cycle representation it tells us that 5 is part of a 1-cycle for this permutation, and the same will apply to 10. Therefore, the cycle representation of the entire permutation is <img src='http://s0.wp.com/latex.php?latex=%281%2C7%2C4%2C13%29%282%2C14%2C8%2C11%29%283%2C6%2C12%2C9%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1,7,4,13)(2,14,8,11)(3,6,12,9)' title='(1,7,4,13)(2,14,8,11)(3,6,12,9)' class='latex' />.</p>
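<p>The procedure just carried out by hand is short enough to write as a program. The following Python sketch is one way of doing it; the function name <code>disjoint_cycles</code> and the decision to store the permutation as a dictionary are assumptions for illustration. As in the text, it repeatedly starts from the smallest number not yet dealt with and leaves out cycles of length 1.</p>

```python
def disjoint_cycles(perm):
    """Return the cycles of length greater than 1 of a permutation given as a dict."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue  # already part of an earlier cycle
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:  # keep following the permutation until we get back
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


# Multiplication by 7, followed by taking the remainder on division by 15.
pi = {x: (7 * x) % 15 for x in range(15)}
cycles = disjoint_cycles(pi)

print(cycles)  # [(1, 7, 4, 13), (2, 14, 8, 11), (3, 6, 12, 9)]
```

<p>The numbers 0, 5 and 10 do not appear in the output because each of them is fixed, which is to say that each forms a cycle of length 1.</p>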
<p>The second example is a product of three cycles that are <em>not</em> disjoint. Suppose I give you the permutation <img src='http://s0.wp.com/latex.php?latex=%281234%29%282468%29%28347%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1234)(2468)(347)' title='(1234)(2468)(347)' class='latex' /> and ask you to write it in cycle notation. That requires disjoint cycles, so we have to do something. With a bit of experience, you can simply <em>write down</em> the answer with no working at all. Let me do so, and then I&#8217;ll say what I did. The answer is <img src='http://s0.wp.com/latex.php?latex=%2812%29%28368%29%2847%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(12)(368)(47)' title='(12)(368)(47)' class='latex' />. How did I get that? I simply followed the procedure above: I started by seeing what 1 went to, then what that went to, and so on. When I completed the first cycle, I found that 3 was the smallest number I hadn&#8217;t yet discussed, so I saw what that went to, then what that went to, etc. The result was the product of three disjoint cycles above. </p>
<p>How did I see what the various numbers went to? The one thing to remember here is that the cycles represent functions and if you are given a composition of functions then you start from the right. So if we want to know what 3 goes to when we do the permutation <img src='http://s0.wp.com/latex.php?latex=%281234%29%282468%29%28347%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1234)(2468)(347)' title='(1234)(2468)(347)' class='latex' /> we first look at the rightmost bracket, which sends 3 to 4. Then we look at the middle bracket, which sends 4 to 6, and finally we look at the first bracket, which doesn&#8217;t do anything to 6. That tells us that the above permutation sends 3 to 6. It is easy to do that kind of calculation in one&#8217;s head, which is why it is easy to write down the disjoint cycle representation of a permutation when it is given in another form.</p>
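<p>Here is the same calculation as a Python sketch, in case you want to check your mental arithmetic. The helper names are assumptions for illustration. The function <code>product</code> applies the rightmost cycle first, exactly as described above, and <code>disjoint_cycles</code> then follows each number round until it returns to where it started.</p>

```python
def cycle_as_function(cycle):
    """Turn a single cycle into a function that fixes everything not in the cycle."""
    successor = {a: cycle[(i + 1) % len(cycle)] for i, a in enumerate(cycle)}
    return lambda x: successor.get(x, x)


def product(cycles):
    """Compose cycles written left to right, applying the rightmost one first."""
    functions = [cycle_as_function(c) for c in cycles]

    def apply(x):
        for f in reversed(functions):
            x = f(x)
        return x

    return apply


def disjoint_cycles(perm):
    """Return the cycles of length greater than 1 of a permutation given as a dict."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


perm = product([(1, 2, 3, 4), (2, 4, 6, 8), (3, 4, 7)])
images = {x: perm(x) for x in range(1, 9)}

print(images[3])                # 6, since 3 -> 4 -> 6 -> 6
print(disjoint_cycles(images))  # [(1, 2), (3, 6, 8), (4, 7)]
```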
<p>3. The third point I wanted to discuss was conjugating permutations. I plan to have a post about conjugation later, but basically conjugation is to do with expressions like <img src='http://s0.wp.com/latex.php?latex=ABA%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ABA^{-1}' title='ABA^{-1}' class='latex' />, which come up all over the place in mathematics. (Why? That is what I want to explain in the post on conjugation.) </p>
<p>We sometimes like to conjugate permutations, so it is useful to know how to work out what <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5Csigma%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi&#92;sigma&#92;pi^{-1}' title='&#92;pi&#92;sigma&#92;pi^{-1}' class='latex' /> is when <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> are given permutations. What&#8217;s more, the answer is very easy.</p>
<p>Let me illustrate it with an example that is half abstract and half concrete. I&#8217;d like to ask what the cycle representation is of the permutation <img src='http://s0.wp.com/latex.php?latex=%5Cpi%281356%29%2824%29%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(1356)(24)&#92;pi^{-1}' title='&#92;pi(1356)(24)&#92;pi^{-1}' class='latex' />. In other words, I&#8217;m setting <img src='http://s0.wp.com/latex.php?latex=%5Csigma%3D%281356%29%2824%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma=(1356)(24)' title='&#92;sigma=(1356)(24)' class='latex' /> and I&#8217;m not telling you what <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> is. </p>
<p>To work this out, we apply the same method yet again, but with a slight twist. This time I don&#8217;t find it all that easy to say what 1 goes to. However, I do find it easy to say what <img src='http://s0.wp.com/latex.php?latex=%5Cpi%281%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(1)' title='&#92;pi(1)' class='latex' /> goes to. Working from the right, we first apply <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi^{-1}' title='&#92;pi^{-1}' class='latex' />, which takes <img src='http://s0.wp.com/latex.php?latex=%5Cpi%281%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(1)' title='&#92;pi(1)' class='latex' /> to 1. Next we apply <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' />, which takes 1 to 3. And finally we apply <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' />, which takes <img src='http://s0.wp.com/latex.php?latex=3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3' title='3' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cpi%283%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(3)' title='&#92;pi(3)' class='latex' />. So <img src='http://s0.wp.com/latex.php?latex=%5Cpi%281%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(1)' title='&#92;pi(1)' class='latex' /> goes to <img src='http://s0.wp.com/latex.php?latex=%5Cpi%283%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(3)' title='&#92;pi(3)' class='latex' />. What does <img src='http://s0.wp.com/latex.php?latex=%5Cpi%283%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(3)' title='&#92;pi(3)' class='latex' /> go to? 
Exactly the same argument shows that it goes to <img src='http://s0.wp.com/latex.php?latex=%5Cpi%285%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(5)' title='&#92;pi(5)' class='latex' />, and a clear pattern emerges: if <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> sends <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5Csigma%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi&#92;sigma&#92;pi^{-1}' title='&#92;pi&#92;sigma&#92;pi^{-1}' class='latex' /> sends <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(a)' title='&#92;pi(a)' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(b)' title='&#92;pi(b)' class='latex' />. (Just try it: <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(a)' title='&#92;pi(a)' class='latex' /> goes to <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' />, which goes to <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />, which goes to <img src='http://s0.wp.com/latex.php?latex=%5Cpi%28b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(b)' title='&#92;pi(b)' class='latex' />.) 
That tells us that to get the cycle representation of <img src='http://s0.wp.com/latex.php?latex=%5Cpi%5Csigma%5Cpi%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi&#92;sigma&#92;pi^{-1}' title='&#92;pi&#92;sigma&#92;pi^{-1}' class='latex' />, we just take the cycle representation of <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' /> and do <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> to everything. In our example we start with the permutation <img src='http://s0.wp.com/latex.php?latex=%281356%29%2824%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1356)(24)' title='(1356)(24)' class='latex' /> and the end result is the permutation <img src='http://s0.wp.com/latex.php?latex=%28%5Cpi%281%29%5Cpi%283%29%5Cpi%285%29%5Cpi%286%29%29%28%5Cpi%282%29%5Cpi%284%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(&#92;pi(1)&#92;pi(3)&#92;pi(5)&#92;pi(6))(&#92;pi(2)&#92;pi(4))' title='(&#92;pi(1)&#92;pi(3)&#92;pi(5)&#92;pi(6))(&#92;pi(2)&#92;pi(4))' class='latex' />.</p>
<p>Now let me make the example completely concrete by taking <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> to be the 4-cycle <img src='http://s0.wp.com/latex.php?latex=%281234%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1234)' title='(1234)' class='latex' />. The rule above tells us that we take the cycle representation <img src='http://s0.wp.com/latex.php?latex=%281356%29%2824%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1356)(24)' title='(1356)(24)' class='latex' /> and change 1 to 2, 2 to 3, 3 to 4 and 4 to 1. That is,</p>
<p><img src='http://s0.wp.com/latex.php?latex=%281234%29%281356%29%2824%29%281234%29%5E%7B-1%7D%3D%282456%29%2831%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1234)(1356)(24)(1234)^{-1}=(2456)(31)' title='(1234)(1356)(24)(1234)^{-1}=(2456)(31)' class='latex' />,</p>
<p>which we might prefer to write as <img src='http://s0.wp.com/latex.php?latex=%282456%29%2813%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(2456)(13)' title='(2456)(13)' class='latex' />. </p>
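<p>The rule &#8220;do the conjugating permutation to everything&#8221; is a one-line function in Python. The sketch below applies it to the concrete example and then, as a sanity check, works out the conjugate the slow way by composing three maps. All the helper names (<code>relabel</code>, <code>compose</code> and so on) are assumptions for illustration.</p>

```python
def cycle_to_map(cycles, n):
    """Build the permutation of {1, ..., n} given by disjoint cycles, as a dict."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for pos, a in enumerate(cycle):
            perm[a] = cycle[(pos + 1) % len(cycle)]
    return perm


def inverse(perm):
    """Return the inverse permutation."""
    return {image: x for x, image in perm.items()}


def compose(f, g):
    """Return the permutation 'f after g': apply g first, then f."""
    return {x: f[g[x]] for x in g}


def relabel(cycles, pi):
    """Apply pi to every entry of every cycle."""
    return [tuple(pi[a] for a in cycle) for cycle in cycles]


n = 6
sigma_cycles = [(1, 3, 5, 6), (2, 4)]
pi = cycle_to_map([(1, 2, 3, 4)], n)

conjugated_cycles = relabel(sigma_cycles, pi)
print(conjugated_cycles)  # [(2, 4, 5, 6), (3, 1)]

# The slow way: pi after sigma after pi^{-1}, worked out point by point.
sigma = cycle_to_map(sigma_cycles, n)
direct = compose(compose(pi, sigma), inverse(pi))
assert direct == cycle_to_map(conjugated_cycles, n)
```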
<p>An important general fact that this rule tells us is that the cycle type of a permutation is unaffected by conjugation. (The cycle type means the information about how many cycles there are of each length.) That tells me, for example, that there is no <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> with the property that <img src='http://s0.wp.com/latex.php?latex=%5Cpi%2812%29%5Cpi%5E%7B-1%7D%3D%28123%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(12)&#92;pi^{-1}=(123)' title='&#92;pi(12)&#92;pi^{-1}=(123)' class='latex' />. On the other hand, if I want to find a permutation <img src='http://s0.wp.com/latex.php?latex=%5Cpi&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi' title='&#92;pi' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cpi%2812%29%5Cpi%5E%7B-1%7D%3D%2834%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;pi(12)&#92;pi^{-1}=(34)' title='&#92;pi(12)&#92;pi^{-1}=(34)' class='latex' />, there is no problem. All I need is some permutation that will take 1 to 3 and 2 to 4. A permutation that will do the job is <img src='http://s0.wp.com/latex.php?latex=%2813%29%2824%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(13)(24)' title='(13)(24)' class='latex' />. Another is <img src='http://s0.wp.com/latex.php?latex=%281324%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(1324)' title='(1324)' class='latex' />. Why this is useful to know will become clear only later in the course.</p>
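<p>For a set as small as {1,2,3,4} these claims can be confirmed by brute force. The Python sketch below runs through all 24 permutations and picks out those that conjugate (12) to (34). Restricting attention to four points, and the helper names, are assumptions made to keep the example small.</p>

```python
from itertools import permutations


def cycle_to_map(cycles, n):
    """Build the permutation of {1, ..., n} given by disjoint cycles, as a dict."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for pos, a in enumerate(cycle):
            perm[a] = cycle[(pos + 1) % len(cycle)]
    return perm


def inverse(perm):
    """Return the inverse permutation."""
    return {image: x for x, image in perm.items()}


def compose(f, g):
    """Return the permutation 'f after g': apply g first, then f."""
    return {x: f[g[x]] for x in g}


def conjugate(pi, sigma):
    """Return pi sigma pi^{-1}."""
    return compose(compose(pi, sigma), inverse(pi))


points = (1, 2, 3, 4)
all_perms = [dict(zip(points, images)) for images in permutations(points)]

t12 = cycle_to_map([(1, 2)], 4)
t34 = cycle_to_map([(3, 4)], 4)
c123 = cycle_to_map([(1, 2, 3)], 4)

to_34 = [pi for pi in all_perms if conjugate(pi, t12) == t34]
to_123 = [pi for pi in all_perms if conjugate(pi, t12) == c123]

print(len(to_34))   # 4
print(len(to_123))  # 0, as the cycle types differ
assert cycle_to_map([(1, 3), (2, 4)], 4) in to_34
assert cycle_to_map([(1, 3, 2, 4)], 4) in to_34
```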
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/16/permutations/feed/</wfw:commentRss>
		<slash:comments>29</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Domains, codomains, ranges, images, preimages, inverse images</title>
		<link>http://gowers.wordpress.com/2011/10/13/domains-codomains-ranges-images-preimages-inverse-images/</link>
		<comments>http://gowers.wordpress.com/2011/10/13/domains-codomains-ranges-images-preimages-inverse-images/#comments</comments>
		<pubDate>Thu, 13 Oct 2011 15:35:08 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[General concepts]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3520</guid>
		<description><![CDATA[If I were writing a textbook, I would have discussed the basics of functions before talking about injections and surjections, but this is not a textbook &#8212; it is a series of blog posts that provide a kind of commentary on some of the lecture courses. However, now that I have got on to the [...]]]></description>
			<content:encoded><![CDATA[<p>If I were writing a textbook, I would have discussed the basics of functions before talking about injections and surjections, but this is not a textbook &#8212; it is a series of blog posts that provide a kind of commentary on some of the lecture courses. However, now that I have got on to the subject of functions, it probably makes sense to discuss them a bit more, especially as I hear that they have made an appearance in Numbers and Sets.</p>
<p>Let me start with the most basic question of all: what <em>is</em> a function? This is one of the first examples (of many, unfortunately) of a concept that you were probably reasonably happy with until your lecturer explained it to you. This is absolutely not a criticism of your lecturer (who is excellent, by the way). It&#8217;s more like a criticism of an entire mathematical tradition that goes back to the days when the foundations were laid for our subject in terms of set theory.<br />
<span id="more-3520"></span></p>
<p>Let&#8217;s have some examples of functions. The ones you are likely to have come across are ones that take real numbers to real numbers: things like <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^2' title='f(x)=x^2' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=g%28x%29%3De%5Ex&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)=e^x' title='g(x)=e^x' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=h%28x%29%3D-2%5Csin%28x%2B%5Cpi%2F3%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)=-2&#92;sin(x+&#92;pi/3)' title='h(x)=-2&#92;sin(x+&#92;pi/3)' class='latex' />. If someone asked, &#8220;Yes, but what <em>are</em> <img src='http://s0.wp.com/latex.php?latex=f%2Cg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f,g' title='f,g' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' />?&#8221; then an appropriate response might be, &#8220;Well, <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is the result of doing something to <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> that turns it into another real number.&#8221;</p>
<p>Notice that in that last sentence I slightly avoided the issue. I didn&#8217;t say what <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> itself was &#8212; I just said what <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> was. (It&#8217;s another real number.) Notice also that for a <em>specific</em> function we don&#8217;t feel quite as tempted to ask what it really is: we don&#8217;t say, &#8220;What <em>is</em> squared?&#8221; That&#8217;s because &#8220;squared&#8221; isn&#8217;t a noun, so it feels wrong to imagine that it must be a thing, just as you would expect strange looks if you went around asking, &#8220;What <em>is</em> and?&#8221; (That&#8217;s not to say that an answer isn&#8217;t possible: one could say that AND is a logical connective, and a logical connective is something that joins two statements to form a new statement, and it could, if you wanted, be regarded as a function from the set of all pairs of statements to the set of all statements, etc. etc.)</p>
<p>What really matters about a function is not so much its essence as the following fact.</p>
<li>If <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a function from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />.</li>
<p>Given a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' />, we can define a natural notion of the <em>graph</em> of that function. It is the set of all points <img src='http://s0.wp.com/latex.php?latex=%28x%2Cf%28x%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,f(x))' title='(x,f(x))' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />. To put it another way, it is the set of all points <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in A&#92;times B' title='(x,y)&#92;in A&#92;times B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' />. This set has the property (discussed in the previous post) that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> belongs to the graph of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />. </p>
<p>And now the officially correct thing to do is to turn everything on its head and make the following definition.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> be sets. A <em>function from</em> <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> <em>to</em> <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is a subset <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times B' title='A&#92;times B' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in f' title='(x,y)&#92;in f' class='latex' />.
</li>
<p>Then one goes on to say that it is traditional to write <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' /> instead of <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in f' title='(x,y)&#92;in f' class='latex' />. Equivalently, we define <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> to be the unique <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in f' title='(x,y)&#92;in f' class='latex' />. </p>
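<p>For finite sets the official definition can be checked mechanically. The following is only a minimal Python sketch: the names <code>is_function</code> and <code>value_at</code> and the sets used are illustrative assumptions, not anything from the lecture courses.</p>
<pre>
```python
def is_function(pairs, A, B):
    """Return True if `pairs` is a function from A to B in the set-theoretic sense."""
    # Every pair must lie in the Cartesian product A x B.
    if any(x not in A or y not in B for (x, y) in pairs):
        return False
    # Every x in A must occur as a first coordinate exactly once.
    return all(sum(1 for (u, _) in pairs if u == x) == 1 for x in A)


def value_at(pairs, x):
    """Return f(x): the unique y such that (x, y) belongs to `pairs`."""
    (y,) = [v for (u, v) in pairs if u == x]
    return y


A = {1, 2, 3}
B = {1, 4, 9, 16}
squares = {(1, 1), (2, 4), (3, 9)}                  # the graph of x -> x^2 on A
not_a_function = {(1, 1), (1, 4), (2, 4), (3, 9)}   # 1 is paired with two values

print(is_function(squares, A, B))         # True
print(is_function(not_a_function, A, B))  # False
print(value_at(squares, 3))               # 9
```
</pre>
<p>Here <code>value_at</code> plays the role of the notation f(x): it picks out the unique second coordinate paired with x.</p>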
<p>Why does anyone bother with this strange definition? One reason is that we sometimes want to talk about the set of <em>all</em> functions from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />, or, even more commonly, the set of all functions from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> that satisfy certain conditions. It&#8217;s one thing to be able to recognise a function when you see one, but how do you say what counts as a function? We somehow want to capture the idea that <em>any</em> way of associating with each <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> some <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> counts as a function, even if that &#8220;way of associating&#8221; isn&#8217;t given by a rule of any kind. </p>
<p>It may seem as though the &#8220;take any old subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times B' title='A&#92;times B' class='latex' /> as long as for each <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there&#8217;s exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> belongs to that subset&#8221; definition is a pretty neat way of capturing this arbitrariness. And in a sense it is. I think psychologically we are happier with the idea of a completely arbitrary set than we are with the idea of a completely arbitrary &#8220;way of associating&#8221; elements of one set with elements of another. But just because we are happier with it, that doesn&#8217;t mean we <em>should</em> be happier with it. What is an &#8220;arbitrary&#8221; set of integers, say? Sets of integers defined by properties such as &#8220;the set of all <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a prime greater than 1000&#8221; are fine, but how can we capture the idea of a &#8220;completely arbitrary&#8221; set of integers that doesn&#8217;t have a definition of that kind? It&#8217;s more or less the same problem as that of capturing the idea of a completely arbitrary function that isn&#8217;t given by a rule. So, I respectfully submit, the subset-of-Cartesian-product definition of functions achieves virtually nothing.</p>
<p>That will probably provoke a lot of disagreement, so let me qualify it slightly. It is useful for some purposes to &#8220;reduce everything to sets&#8221;. If I am working on the foundations of mathematics and I show that every statement to do with functions can be translated into an equivalent statement to do with sets, then I have shown that if I can sort sets out then I don&#8217;t have to do any further work to sort out functions as well. But in what one might call &#8220;everyday mathematics&#8221; I think that the definition of functions in terms of subsets of Cartesian products is of no use whatsoever. </p>
<p>Hmm, I&#8217;m worried that I&#8217;m still exaggerating. Let me consider a statement that might make it seem important to answer the question, &#8220;What <em>is</em> a function?&#8221; It is the following.</p>
<li>There are uncountably many functions from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />.</li>
<p>You haven&#8217;t yet been told what this means, but for now just think of &#8220;uncountably many&#8221; as meaning &#8220;not just infinitely many but an extra-specially big infinitely many&#8221; or something like that. We obviously can&#8217;t prove a statement like that by listing a whole bunch of functions, so doesn&#8217;t that force us to have some general idea of what a function <em>is</em>? </p>
<p>Here are two arguments that it doesn&#8217;t. One is that I can prove this by reducing it to another problem. For each positive real number <img src='http://s0.wp.com/latex.php?latex=%5Calpha&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;alpha' title='&#92;alpha' class='latex' /> I can define <img src='http://s0.wp.com/latex.php?latex=f%28n%29%3D%5Clceil%5Calpha+n%5Crceil&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(n)=&#92;lceil&#92;alpha n&#92;rceil' title='f(n)=&#92;lceil&#92;alpha n&#92;rceil' class='latex' /> (this means the smallest integer greater than or equal to <img src='http://s0.wp.com/latex.php?latex=%5Calpha+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;alpha n' title='&#92;alpha n' class='latex' />). It&#8217;s easy to check that these are all different functions. And then we can appeal to the fact that there are uncountably many positive real numbers. </p>
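<p>One way to see that these functions really are all different: if alpha and beta are positive with beta bigger than alpha, choose n with (beta - alpha) n at least 1. Then beta n is at least alpha n + 1, so rounding beta n up gives a strictly larger integer than rounding alpha n up. Below is a small Python sketch of that step. It uses exact rationals purely as an assumption to avoid floating-point rounding (the argument itself works for all positive reals), and the helper name <code>separating_n</code> is hypothetical.</p>
<pre>
```python
from fractions import Fraction
from math import ceil


def f(alpha, n):
    """The function n -> ceil(alpha * n) attached to a positive number alpha."""
    return ceil(alpha * n)


def separating_n(alpha, beta):
    """Given positive alpha and a larger beta, return an n where the two functions differ."""
    # Choose n with (beta - alpha) * n at least 1, so beta * n is at least alpha * n + 1.
    return ceil(1 / (beta - alpha))


alpha, beta = Fraction(1, 3), Fraction(2, 5)
n = separating_n(alpha, beta)
print(n, f(alpha, n), f(beta, n))   # 15 5 6
```
</pre>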
<p>The second argument is closely related. It is also true that there are uncountably many subsets of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />, but nobody feels that we have to say what a subset of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> <em>is</em> in order to make sense of this statement. We just need to know a few rules for dealing with sets: in particular, for defining new sets out of old ones.</p>
<p>Just before I move on, let me express particular distaste for any definition that begins, &#8220;A function from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is a relation such that &#8230;&#8221; I absolutely hate this. The reason I hate it is that functions and relations are, to any reasonable person, <em>different kinds of things</em>, except that I don&#8217;t want to call them things at all, so what I really mean is that they have a different <em>grammar</em>.</p>
<p>By grammar I mean rules like the following.</p>
<p>1. If <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> denotes an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> denotes an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />.</p>
<p>2. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> denote statements, then <img src='http://s0.wp.com/latex.php?latex=P%5Cvee+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;vee Q' title='P&#92;vee Q' class='latex' /> denotes a statement.</p>
<p>3. If <img src='http://s0.wp.com/latex.php?latex=%5Csim&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sim' title='&#92;sim' class='latex' /> is a relation on <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times B' title='A&#92;times B' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%5Csim+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;sim y' title='x&#92;sim y' class='latex' /> is a statement.</p>
<p>4. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is a property defined on a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' /> is a statement.</p>
<p>5. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is a property defined on a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%5Cin+X%3AP%28x%29%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x&#92;in X:P(x)&#92;}' title='&#92;{x&#92;in X:P(x)&#92;}' class='latex' /> is a subset of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />.</p>
<p>I&#8217;ll talk more about this kind of thing in a later post, but I hope these examples give you the idea. And 1 and 3 demonstrate that the grammar of functions is not the same as the grammar of relations. The fact that you can turn them into equivalent concepts that do have the same grammar is neither here nor there. You can do that with nouns and adjectives too. For example, I could decide that from now on I&#8217;m going to say, &#8220;There&#8217;s a red&#8221; where I used to say, &#8220;That&#8217;s red.&#8221; If I wanted to say, &#8220;My car is green,&#8221; I would say, &#8220;My car is a green,&#8221; just as I might say, &#8220;That dog is a Rottweiler.&#8221; It would be possible (basically, nouns and adjectives are both ways of picking out some subset of the set of all possible objects in the world) &#8212; but it would also be a bit weird.</p>
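<p>Programmers may recognise this notion of grammar as something like a type signature. As a rough illustration only (the examples and names below are assumptions chosen for the sketch, and Python&#8217;s True and False stand in for statements), rules 1, 3, 4 and 5 might look like this.</p>
<pre>
```python
def square(x: int) -> int:
    """Rule 1: an element of A goes in, an element of B comes out."""
    return x * x


def divides(x: int, y: int) -> bool:
    """Rule 3: two elements go in, a statement (True or False) comes out."""
    return y % x == 0


def is_even(x: int) -> bool:
    """Rule 4: a property applied to an element gives a statement."""
    return x % 2 == 0


X = set(range(1, 11))
evens = {x for x in X if is_even(x)}   # Rule 5: the subset {x in X : P(x)}

print(square(3))       # 9
print(divides(3, 12))  # True
print(sorted(evens))   # [2, 4, 6, 8, 10]
```
</pre>
<p>The function returns an element, while the relation returns a truth value: that difference in what comes out is the difference in grammar.</p>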
<p>But none of this is what I really wanted to talk about. Rather, I wanted to discuss a few confusing bits of terminology. </p>
<p>To begin with, what are domains, ranges and codomains? What&#8217;s confusing about this is that there isn&#8217;t a standard terminology. The one I was taught as an undergraduate was this. If <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is a function, then <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is called the domain of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is called the range of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />. That&#8217;s the sense in which I&#8217;ve been using the words in these posts and I hope it agrees with what your lecturers have said.</p>
<p>However, some people use the word &#8220;codomain&#8221; for <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> instead of &#8220;range&#8221;. Worse still (from the point of view of communication between mathematicians), people who call <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> the codomain often use the word &#8220;range&#8221; to refer to the set of values taken by <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />: that is, to the set <img src='http://s0.wp.com/latex.php?latex=%5C%7Bf%28x%29%3Ax%5Cin+A%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{f(x):x&#92;in A&#92;}' title='&#92;{f(x):x&#92;in A&#92;}' class='latex' />. I call that the <em>image</em> of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />, and I think that is probably the Cambridge standard.</p>
<p>To give an example, let&#8217;s take the function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> defined by <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^2' title='f(x)=x^2' class='latex' />. In my terminology, the domain and range are both <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> and the image is the set of non-negative real numbers. I&#8217;m also happy to call <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> the codomain if you want &#8212; for me, &#8220;codomain&#8221; and &#8220;range&#8221; mean the same thing. </p>
<p>However, some people would say that the domain and codomain are <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> while the range is the set of non-negative reals. I think they would regard &#8220;range&#8221; as synonymous with &#8220;image&#8221;. </p>
<p>So I suppose we would all get along fine if we just abolished the word &#8220;range&#8221;. </p>
<p>Unfortunately, that isn&#8217;t the end of the confusion, since some people use the word &#8220;domain&#8221; to mean &#8220;set of points where <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> makes sense&#8221;. To give an example, they might say this: &#8220;Let us define <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3D1%2Fx%28x%2B2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=1/x(x+2)' title='f(x)=1/x(x+2)' class='latex' />. Then the domain of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is the set of all real numbers apart from <img src='http://s0.wp.com/latex.php?latex=0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='0' title='0' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=-2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-2' title='-2' class='latex' />.&#8221; Apparently, this use of the word &#8220;domain&#8221; is quite common in schools. According to my terminology, the function just defined was not a function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> at all. Why not? Because it doesn&#8217;t assign a real number to 0 or to -2. </p>
<p>So if you want to be safe, then given a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> you can call <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> the domain, <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> the codomain, and <img src='http://s0.wp.com/latex.php?latex=%5C%7Bf%28x%29%3Ax%5Cin+A%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{f(x):x&#92;in A&#92;}' title='&#92;{f(x):x&#92;in A&#92;}' class='latex' /> the image.</p>
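<p>To make the safe terminology concrete, here is a small Python sketch with a finite stand-in for the squaring example. The particular domain and codomain are assumptions chosen for illustration, and <code>image</code> is a hypothetical helper name.</p>
<pre>
```python
def image(f, domain):
    """Return the image {f(x) : x in domain} of f."""
    return {f(x) for x in domain}


def f(x):
    return x * x


domain = {-2, -1, 0, 1, 2}
codomain = set(range(-4, 5))   # our choice: any set containing every value of f would do

print(sorted(image(f, domain)))              # [0, 1, 4]
print(image(f, domain).issubset(codomain))   # True
print(image(f, domain) == codomain)          # False
```
</pre>
<p>The codomain is part of how the function is specified, whereas the image is determined by the values the function actually takes, and it can be much smaller.</p>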
<p>That is still not the end of the potential confusion. I said earlier that the one thing you need to know about a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is that if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. In a nice friendly world, you could deduce that whenever you see <img src='http://s0.wp.com/latex.php?latex=f%28%2A%2A%2A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(***)' title='f(***)' class='latex' />, then whatever <img src='http://s0.wp.com/latex.php?latex=%2A%2A%2A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='***' title='***' class='latex' /> is must be an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. Unfortunately that&#8217;s not the case in the world we actually inhabit: we often write <img src='http://s0.wp.com/latex.php?latex=f%28C%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(C)' title='f(C)' class='latex' /> when <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> is a <em>subset</em> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. </p>
<p>Now in a sense that&#8217;s just plain incorrect. Functions from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are little machines that turn <em>elements</em> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> into <em>elements</em> of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. So how can we write <img src='http://s0.wp.com/latex.php?latex=f%28C%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(C)' title='f(C)' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> is a <em>subset</em> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />? The answer is that the wrongness of doing that tells us one of two things:</p>
<p>(i) the writer has made a mistake;</p>
<p>(ii) the writer <em>means something different</em>.</p>
<p>You should have enough confidence in your lecturers and textbooks to assume that it is (ii) that holds and not (i). So what does <img src='http://s0.wp.com/latex.php?latex=f%28C%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(C)' title='f(C)' class='latex' /> mean? It means the set of all <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in C' title='x&#92;in C' class='latex' />, or in symbols <img src='http://s0.wp.com/latex.php?latex=%5C%7Bf%28x%29%3Ax%5Cin+C%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{f(x):x&#92;in C&#92;}' title='&#92;{f(x):x&#92;in C&#92;}' class='latex' />. For example, if <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is the function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> that takes <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=n%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n^2' title='n^2' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is the set of all even numbers, then <img src='http://s0.wp.com/latex.php?latex=f%28E%29%3D%5C%7B4%2C16%2C36%2C64%2C100%2C144%2C...%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(E)=&#92;{4,16,36,64,100,144,...&#92;}' title='f(E)=&#92;{4,16,36,64,100,144,...&#92;}' class='latex' />. 
The set <img src='http://s0.wp.com/latex.php?latex=f%28C%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(C)' title='f(C)' class='latex' /> is a subset of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and it is called the image of <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' />. </p>
<p>The alarm bells should be ringing again. Earlier, I defined the image of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> to be the set <img src='http://s0.wp.com/latex.php?latex=%5C%7Bf%28x%29%3Ax%5Cin+A%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{f(x):x&#92;in A&#92;}' title='&#92;{f(x):x&#92;in A&#92;}' class='latex' />, which we now see that we can write as <img src='http://s0.wp.com/latex.php?latex=f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)' title='f(A)' class='latex' />. So is the set <img src='http://s0.wp.com/latex.php?latex=f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)' title='f(A)' class='latex' /> the image of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> (as I said earlier) or the image of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> (as would be consistent with the more recent definition)? You just have to be alert to the context. Functions have images, but if you&#8217;re talking about a given function that&#8217;s clear from the context, then you can also talk about images of subsets. Oh, and while we&#8217;re at it, if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is called the image of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. </p>
<p>There is nothing for it but to get used to the fact that the same words and notation can be used for concepts with different &#8212; and confusable &#8212; meanings. If you see <img src='http://s0.wp.com/latex.php?latex=f%28%2A%2A%2A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(***)' title='f(***)' class='latex' />, then one of the first things you should do is look at what&#8217;s in those brackets and ask yourself what kind of object it is. If it&#8217;s an element of the domain (which will usually be indicated by a lower-case letter, which helps to reduce the confusion), then what you&#8217;ve got is an element of the codomain. If it&#8217;s a subset of the domain, then what you&#8217;ve got is a subset of the codomain. </p>
<p>Let me give a different example of this kind of use of &#8220;element notation applied to subsets&#8221;. If <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are sets of integers, we sometimes write <img src='http://s0.wp.com/latex.php?latex=A%2BB&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A+B' title='A+B' class='latex' />. You might object that this cannot be correct, on the grounds that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are <em>sets</em> and you don&#8217;t add sets together &#8212; you take things like unions and intersections. And that objection is right in the following sense: when we write <img src='http://s0.wp.com/latex.php?latex=A%2BB&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A+B' title='A+B' class='latex' /> <em>we are not giving the usual meaning to the plus symbol</em>. So what do we mean? We actually mean something fairly natural, which is the set of all numbers you can make by adding something in <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to something in <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. In symbols, <img src='http://s0.wp.com/latex.php?latex=A%2BB%3D%5C%7Bx%2By%3Ax%5Cin+A%2C+y%5Cin+B%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A+B=&#92;{x+y:x&#92;in A, y&#92;in B&#92;}' title='A+B=&#92;{x+y:x&#92;in A, y&#92;in B&#92;}' class='latex' />.</p>
<p>A quick example: if <img src='http://s0.wp.com/latex.php?latex=A%3D%5C%7B1%2C3%2C5%2C10%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A=&#92;{1,3,5,10&#92;}' title='A=&#92;{1,3,5,10&#92;}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B%3D%5C%7B1%2C4%2C7%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B=&#92;{1,4,7&#92;}' title='B=&#92;{1,4,7&#92;}' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=A%2BB%3D%5C%7B2%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C14%2C17%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A+B=&#92;{2,4,5,6,7,8,9,10,11,12,14,17&#92;}' title='A+B=&#92;{2,4,5,6,7,8,9,10,11,12,14,17&#92;}' class='latex' />. Here, I&#8217;m not really adding sets: I&#8217;m adding the elements and forming a set out of all possible results. In a similar way, when I take the set <img src='http://s0.wp.com/latex.php?latex=f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)' title='f(A)' class='latex' />, I&#8217;m not applying the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> to the set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />: I&#8217;m applying <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> to the elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and forming a set out of all possible results. It&#8217;s a very important distinction.</p>
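<p>If you like checking such things by machine, here is a minimal Python sketch of both constructions. The helper names <code>sumset</code> and <code>image</code>, and the squaring function, are made up purely for this illustration.</p>

```python
def sumset(A, B):
    """Return A+B: every sum of an element of A and an element of B."""
    return {x + y for x in A for y in B}


def image(f, C):
    """Return f(C): apply f to each element of C and collect the results."""
    return {f(x) for x in C}


A = {1, 3, 5, 10}
B = {1, 4, 7}
print(sorted(sumset(A, B)))


# An arbitrary function, chosen only for illustration.
def square(x):
    return x * x


print(sorted(image(square, {-2, -1, 0, 1, 2})))
```

<p>Neither helper ever adds two sets or feeds a set to the function: all the work is done element by element, and the set is formed afterwards. The first call reproduces the twelve-element set given above, and the second gives a set with only three elements, because several inputs share an image.</p>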
<p>I&#8217;ve said that if <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is called the image of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. Something similar happens the other way round. If <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is called a preimage of <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. Note a very important distinction between these two definitions: I talked about <em>the</em> image but <em>a</em> preimage. That&#8217;s because the definition of a function requires there to be exactly one image for each element <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, but if I pick <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> it might not have any preimages, and it might have more than one preimage.</p>
<p>Finally, a rare situation where we don&#8217;t use the same word twice &#8212; however, we make up for this big time in our choice of symbolic notation. If we have a subset <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />, then the <em>inverse image</em> of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' />, denoted <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(D)' title='f^{-1}(D)' class='latex' />, is defined to be the set of all preimages of elements of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' />. Equivalently, it is the set of all <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in D' title='f(x)&#92;in D' class='latex' />. Equivalently again, it is the set <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%5Cin+A%3Af%28x%29%5Cin+D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x&#92;in A:f(x)&#92;in D&#92;}' title='&#92;{x&#92;in A:f(x)&#92;in D&#92;}' class='latex' />. </p>
<p>Here&#8217;s a quick example. Let <img src='http://s0.wp.com/latex.php?latex=A%3D%5C%7B1%2C2%2C3%2C4%2C5%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A=&#92;{1,2,3,4,5&#92;}' title='A=&#92;{1,2,3,4,5&#92;}' class='latex' />, let <img src='http://s0.wp.com/latex.php?latex=B%3D%5C%7B1%2C2%2C3%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B=&#92;{1,2,3,4&#92;}' title='B=&#92;{1,2,3,4&#92;}' class='latex' /> and define <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> as follows: <img src='http://s0.wp.com/latex.php?latex=f%281%29%3Df%282%29%3D1%2C+f%283%29%3D2%2C+f%284%29%3Df%285%29%3D3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(1)=f(2)=1, f(3)=2, f(4)=f(5)=3' title='f(1)=f(2)=1, f(3)=2, f(4)=f(5)=3' class='latex' />. Then the inverse image of the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,4&#92;}' title='&#92;{1,4&#92;}' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C2%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,2&#92;}' title='&#92;{1,2&#92;}' class='latex' />. Why? Because 1 and 2 are the elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> that have images that belong to the set <img src='http://s0.wp.com/latex.php?latex=%5C%7B1%2C4%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{1,4&#92;}' title='&#92;{1,4&#92;}' class='latex' />. </p>
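<p>The same example can be written as a short Python sketch, with the function stored as a lookup table. The names <code>inverse_image</code> and <code>preimages</code> are assumptions of this sketch rather than standard library functions.</p>

```python
# The function from the example, stored as a lookup table on A = {1,2,3,4,5}.
f = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def inverse_image(f, D):
    """Return the set of all x in the domain of f such that f(x) is in D."""
    return {x for x in f if f[x] in D}


def preimages(f, y):
    """Return every preimage of the single element y (there may be none)."""
    return inverse_image(f, {y})


print(inverse_image(f, {1, 4}))
print(preimages(f, 4))
print(preimages(f, 3))
```

<p>The first call returns the set containing 1 and 2, the second returns the empty set, and the third returns the set containing 4 and 5. Notice that the code never tries to run the function backwards: it only ever asks whether the image of an element lies in the given set.</p>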
<p>I hope you will have noticed something very important about that example, which is that the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> in question <em>does not have an inverse</em>. In fact, it doesn&#8217;t even come close: the fact that <img src='http://s0.wp.com/latex.php?latex=f%281%29%3Df%282%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(1)=f(2)' title='f(1)=f(2)' class='latex' /> shows that it isn&#8217;t an injection (which means that if we tried to form an inverse <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> we wouldn&#8217;t be able to decide between setting <img src='http://s0.wp.com/latex.php?latex=g%281%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(1)=1' title='g(1)=1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%281%29%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(1)=2' title='g(1)=2' class='latex' />) and the fact that 4 has no preimage shows that it isn&#8217;t a surjection either (we would have no idea what value to give to <img src='http://s0.wp.com/latex.php?latex=g%284%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(4)' title='g(4)' class='latex' />). And yet, I happily wrote <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28%5C%7B1%2C4%5C%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(&#92;{1,4&#92;})' title='f^{-1}(&#92;{1,4&#92;})' class='latex' />. (Actually, I didn&#8217;t write it, but I&#8217;m writing it now. And I&#8217;m happy.)</p>
<p>This is a very frequent source of confusion. Generation after generation of Cambridge undergraduates see an expression like <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(D)' title='f^{-1}(D)' class='latex' /> and conclude, wrongly but not entirely unreasonably, that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> has an inverse. Indeed, it looks as though it must mean the image of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> under the inverse function of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />. But it doesn&#8217;t (except that if <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> does happen to have an inverse, then <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(D)' title='f^{-1}(D)' class='latex' /> does happen to be the image of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> under that inverse). </p>
<p>The best I can do to help you with understanding inverse images of sets is this. If you ever see a sentence of the form <img src='http://s0.wp.com/latex.php?latex=x%5Cin+f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in f^{-1}(D)' title='x&#92;in f^{-1}(D)' class='latex' />, then you are at liberty to translate it into the equivalent but more transparent sentence <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in D' title='f(x)&#92;in D' class='latex' />. What&#8217;s more, I recommend doing so.</p>
<p>Suppose, for example, that you are asked to prove the following simple fact. (At least, it&#8217;s very simple once you are used to the definitions and to standard techniques for writing proofs.)</p>
<p><strong>Fact.</strong> Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> be sets and let <img src='http://s0.wp.com/latex.php?latex=f%3AX%5Cto+Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:X&#92;to Y' title='f:X&#92;to Y' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> be a subset of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> be a subset of <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' />. Prove that <img src='http://s0.wp.com/latex.php?latex=f%28A%29%5Csubset+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)&#92;subset B' title='f(A)&#92;subset B' class='latex' /> if and only if <img src='http://s0.wp.com/latex.php?latex=A%5Csubset+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;subset f^{-1}(B)' title='A&#92;subset f^{-1}(B)' class='latex' />.</p>
<p>If you&#8217;re asked to prove an if and only if, then you start by assuming one side and deducing the other, and then you prove the implication in the opposite direction. So let&#8217;s begin by assuming that <img src='http://s0.wp.com/latex.php?latex=f%28A%29%5Csubset+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)&#92;subset B' title='f(A)&#92;subset B' class='latex' />. What do we need to prove? We want to show that <img src='http://s0.wp.com/latex.php?latex=A%5Csubset+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;subset f^{-1}(B)' title='A&#92;subset f^{-1}(B)' class='latex' />. How do we show something like that? If you&#8217;ve learnt the definition of &#8220;is a subset of&#8221;, then you will call up to the front of your brain the following statement as what we want to prove.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=x%5Cin+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in f^{-1}(B)' title='x&#92;in f^{-1}(B)' class='latex' />.</li>
<p>And if you have taken on board the advice in the post on injections and surjections, you will now immediately write &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />.&#8221; The task is now to prove that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in f^{-1}(B)' title='x&#92;in f^{-1}(B)' class='latex' />.</p>
<p>Aha! That is a sentence of exactly the form that allows us to get rid of that nasty and confusing <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}' title='f^{-1}' class='latex' />, since it is equivalent to the statement <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. So we know that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> and we want to prove that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. What were we given? Oh yes, that <img src='http://s0.wp.com/latex.php?latex=f%28A%29%5Csubset+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)&#92;subset B' title='f(A)&#92;subset B' class='latex' />. But <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in f(A)' title='f(x)&#92;in f(A)' class='latex' />. But if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in f(A)' title='f(x)&#92;in f(A)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28A%29%5Csubset+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)&#92;subset B' title='f(A)&#92;subset B' class='latex' />, it follows (directly from the definition of &#8220;is a subset of&#8221;) that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />, just as we wanted.</p>
<p>How about the other direction? This time we assume that <img src='http://s0.wp.com/latex.php?latex=A%5Csubset+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;subset f^{-1}(B)' title='A&#92;subset f^{-1}(B)' class='latex' /> and we want to prove that <img src='http://s0.wp.com/latex.php?latex=f%28A%29%5Csubset+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)&#92;subset B' title='f(A)&#92;subset B' class='latex' />. So we want to prove that every element of <img src='http://s0.wp.com/latex.php?latex=f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)' title='f(A)' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. But every element of <img src='http://s0.wp.com/latex.php?latex=f%28A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(A)' title='f(A)' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> for some <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, so we can begin with, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />,&#8221; and know that our target is to prove that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. 
(This is a slight elaboration of the &#8220;let&#8221; trick that is convenient for dealing with a situation where what we are sort of saying is, &#8220;For every <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, such-and-such happens.&#8221;) We know that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />, and this, together with the hypothesis that <img src='http://s0.wp.com/latex.php?latex=A%5Csubset+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;subset f^{-1}(B)' title='A&#92;subset f^{-1}(B)' class='latex' />, tells us that <img src='http://s0.wp.com/latex.php?latex=x%5Cin+f%5E%7B-1%7D%28B%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in f^{-1}(B)' title='x&#92;in f^{-1}(B)' class='latex' />.</p>
<p>Aha! We can get rid of that nasty and confusing <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}' title='f^{-1}' class='latex' />: we now know that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. Better still, that&#8217;s exactly what we were trying to prove.</p>
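<p>A proof is what the question asks for, and nothing below replaces it, but it can be reassuring to watch the Fact hold in a small case. The following Python sketch checks the equivalence for every pair of subsets of one tiny example. The sets <code>X</code> and <code>Y</code> and the function <code>f</code> are invented for this illustration.</p>

```python
from itertools import chain, combinations


def subsets(S):
    """Return a list of all subsets of the finite set S."""
    items = list(S)
    all_tuples = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(t) for t in all_tuples]


def image(f, A):
    """Return f(A) for a function f stored as a lookup table."""
    return {f[x] for x in A}


def inverse_image(f, B):
    """Return the set of all x in the domain of f such that f(x) is in B."""
    return {x for x in f if f[x] in B}


# A small function from X to Y, chosen only for illustration.
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}

# For each pair (A, B), compare the truth values of the two sides of the Fact.
fact_holds = all(
    image(f, A).issubset(B) == A.issubset(inverse_image(f, B))
    for A in subsets(X)
    for B in subsets(Y)
)
print(fact_holds)
```

<p>This prints <code>True</code>: the two sides agree for all 16 choices of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and all 8 choices of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. Of course, a check of one example says nothing about functions in general, which is why the argument above is still needed.</p>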
<p>Just in case I haven&#8217;t made it sufficiently clear, if <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />, then it is <em>incorrect</em> to say that <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is an inverse image of <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> (it is a preimage) and it is <em>incorrect</em> to write <img src='http://s0.wp.com/latex.php?latex=x%3Df%5E%7B-1%7D%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=f^{-1}(y)' title='x=f^{-1}(y)' class='latex' /> (the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> might not have an inverse, and if it doesn&#8217;t, then <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28%2A%2A%2A%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(***)' title='f^{-1}(***)' class='latex' /> makes sense only if <img src='http://s0.wp.com/latex.php?latex=%2A%2A%2A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='***' title='***' class='latex' /> is a subset of the codomain of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />). 
Similarly, if <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> is a subset of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(D)' title='f^{-1}(D)' class='latex' /> is <em>not</em> a preimage, or even the preimage, of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> (it is the inverse image). And the fact that we write <img src='http://s0.wp.com/latex.php?latex=f%5E%7B-1%7D%28D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f^{-1}(D)' title='f^{-1}(D)' class='latex' /> does <em>not</em> mean that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> has an inverse. </p>
<p>It only remains for me to apologize on behalf of the mathematical community for the historical accidents that have led to this jumble of overlapping terminology and notation. You have no choice but to learn it and be very careful about using it. The one thing I would say is that one gets used to it &#8212; so much so that it becomes hard to remember what it was like to find it confusing. </p>
<p>That reminds me that I haven&#8217;t finished discussing non-standardness of terminology to do with functions. You will often see the following alternative terminology for injections, surjections and bijections. Injections are called one-to-one functions, surjections are called onto functions, and bijections are called one-to-one correspondences. This terminology is pretty confusing (see <a href="http://gowers.wordpress.com/2011/10/11/injections-surjections-and-all-that/#comment-12323">Terence Tao&#8217;s comment on one-to-one functions</a>), but you probably have to learn it too.</p>
<p>I&#8217;m mentioning this here because I want to recommend another of <a href="http://scherk.pbworks.com/w/page/14864181/FrontPage">the quizzes</a>, the one on functions. If you aren&#8217;t quite sure whether you have understood the material in this post and the previous one (and what your lecturers have said on similar topics), then trying out that quiz will soon tell you how you are doing. But it uses &#8220;one-to-one function&#8221;, &#8220;onto function&#8221; and &#8220;one-to-one correspondence&#8221; instead of &#8220;injection&#8221;, &#8220;surjection&#8221; and &#8220;bijection&#8221;, so you&#8217;ll need to be ready to use that terminology.</p>
<hr />
<p><strong>Added later.</strong> I received an email from Imre Leader expressing disagreement with my assertion that the set-theoretic definition of functions achieves nothing. Since he had some valid points to make, and since we ended up agreeing with each other completely (I think), I&#8217;d like to report on our exchange.</p>
<p>Imre&#8217;s point is that, as I said above, people just <em>are</em> more comfortable with the idea of an arbitrary set not defined by a nice property than they are with the idea of an arbitrary function not defined by a nice rule. I maintain that this is irrational, but even if it is, I can&#8217;t deny that it is true. So if you give the set-theoretic definition of functions, then you make completely clear that functions can be just as arbitrary as sets.</p>
<p>My eventual response (after a certain amount of thought, and an email that Imre disagreed with in a number of places) was that I agreed that the set-theoretic definition of functions helps one to understand how arbitrary functions can be, but that that benefit can be achieved in a different and better way. Instead of saying that a function from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> <em>is</em> a subset <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times B' title='A&#92;times B' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in f' title='(x,y)&#92;in f' class='latex' />, we should say that every function <em>can be obtained</em> from such a set. It turns out that Imre is entirely happy with that idea. (Also, he was at pains to stress that he is no fonder of turning everything into sets than I am.)</p>
<p>Here in detail is what I would say. Let <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> (to stand for &#8220;graph&#8221;) be a subset of <img src='http://s0.wp.com/latex.php?latex=A%5Ctimes+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A&#92;times B' title='A&#92;times B' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there is exactly one <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in G' title='(x,y)&#92;in G' class='latex' />. Then we can define a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> in terms of <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> by letting <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> be the unique <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in G' title='(x,y)&#92;in G' class='latex' />. 
Moreover, every function from <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> can be obtained this way, since if <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is such a function we can define <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> to be the set of all <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' />. </p>
<p>What I like about this approach is that it doesn&#8217;t feel unnecessarily paradoxical. I&#8217;m not saying, &#8220;Actually a function isn&#8217;t a function at all &#8212; it&#8217;s a funny kind of set.&#8221; Rather, I&#8217;m saying that there&#8217;s a one-to-one correspondence between functions and funny kinds of sets. This captures the arbitrariness of functions (if you believe that sets can be very arbitrary, then you can carry this arbitrariness over to functions), but it also preserves their function-like nature (given a set <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> with certain properties, I then tell you a rule for associating elements of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> with elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />). </p>
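<p>For what it is worth, the correspondence between functions and graphs is easy to act out in Python. This is only a sketch: the helper names <code>function_from_graph</code> and <code>graph_from_function</code> are invented here, the sets are tiny, and the function is stored as a lookup table.</p>

```python
def function_from_graph(G, A, B):
    """Turn a graph G, given as a set of pairs from A x B, into a lookup table.

    Raises ValueError unless each x in A has exactly one partner y in B.
    """
    if not all(x in A and y in B for (x, y) in G):
        raise ValueError("G is not a subset of A x B")
    f = {}
    for x in A:
        partners = {y for (u, y) in G if u == x}
        if len(partners) != 1:
            raise ValueError(f"{x!r} has {len(partners)} partners, not exactly one")
        f[x] = partners.pop()
    return f


def graph_from_function(f):
    """Return the set of all pairs (x, y) with y = f(x)."""
    return {(x, f[x]) for x in f}


A = {1, 2, 3}
B = {"p", "q"}
G = {(1, "p"), (2, "p"), (3, "q")}

f = function_from_graph(G, A, B)
print(f[3])
print(graph_from_function(f) == G)
```

<p>Going from the graph to the function and back again returns the set we started with, so the last line prints <code>True</code>. A set of pairs in which some element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> has no partner, or more than one, is rejected with an error, which is the &#8220;exactly one&#8221; condition at work.</p>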
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/13/domains-codomains-ranges-images-preimages-inverse-images/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Injections, surjections and all that</title>
		<link>http://gowers.wordpress.com/2011/10/11/injections-surjections-and-all-that/</link>
		<comments>http://gowers.wordpress.com/2011/10/11/injections-surjections-and-all-that/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 10:36:13 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Cambridge teaching]]></category>
		<category><![CDATA[General concepts]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3480</guid>
		<description><![CDATA[My spies tell me that the second lecture of the Group Theory course contained a discussion of functions, and in particular bijections &#8212; for which it was necessary to prove a few results about injections and surjections. Since a good understanding of functions is essential throughout mathematics, perhaps a post on the topic would be [...]]]></description>
			<content:encoded><![CDATA[<p>My spies tell me that the second lecture of the Group Theory course contained a discussion of functions, and in particular bijections &#8212; for which it was necessary to prove a few results about injections and surjections. Since a good understanding of functions is essential throughout mathematics, perhaps a post on the topic would be in order. My aim in these posts is not to cover the material all over again, but rather to stress the points that you need to understand in order to be able to write proofs. So let me give a few thoughts about functions, in no particular order.<br />
<span id="more-3480"></span></p>
<p>1. <em>The range and domain matter.</em></p>
<p>Two functions are considered equal if they have the same domain and the same range, and if they do the same thing to everything in the domain. In other words, there is more to a function than what it does. For example, let&#8217;s write <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> for the set of real numbers and <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D_%2B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}_+' title='&#92;mathbb{R}_+' class='latex' /> for the set of non-negative real numbers, and let&#8217;s define two functions as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^2' title='f(x)=x^2' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in&#92;mathbb{R}' title='x&#92;in&#92;mathbb{R}' class='latex' />.</li>
<li><img src='http://s0.wp.com/latex.php?latex=g%3A%5Cmathbb%7BR%7D_%2B%5Cto%5Cmathbb%7BR%7D_%2B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g:&#92;mathbb{R}_+&#92;to&#92;mathbb{R}_+' title='g:&#92;mathbb{R}_+&#92;to&#92;mathbb{R}_+' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%28x%29%3Dx%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)=x^2' title='g(x)=x^2' class='latex' /> for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin%5Cmathbb%7BR%7D_%2B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in&#92;mathbb{R}_+' title='x&#92;in&#92;mathbb{R}_+' class='latex' />.</li>
<p>In a certain sense, these two functions are defined by the same mathematical process &#8212; that of taking a real number and squaring it. But the process is not itself the function: the two functions are different because they have different domains and ranges.</p>
<p>Why do we insist on this point? One pretty convincing reason is that if we didn&#8217;t, then we wouldn&#8217;t be able to talk about injections, surjections and bijections. For example, the function <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> I have just defined is a bijection, whereas the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> that I defined before it is neither an injection nor a surjection.</p>
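<p>Here is a minimal sketch of this point in Python, with the caveat that a computer cannot range over all real numbers: the finite lists below are stand-ins chosen purely for illustration, and the helper names <code>is_injection</code> and <code>is_surjection</code> are hypothetical rather than taken from the course. The same squaring rule is paired with two different choices of domain and range, and the answers differ.</p>

```python
def is_injection(rule, domain):
    """No two distinct elements of the domain share a value."""
    values = [rule(x) for x in domain]
    return len(set(values)) == len(values)


def is_surjection(rule, domain, codomain):
    """Every element of the codomain is the value of some element of the domain."""
    return {rule(x) for x in domain} == set(codomain)


def square(x):
    return x * x


# Finite stand-in for f (assumption: a few integers in place of the reals).
f_domain = [-2, -1, 0, 1, 2]
f_range = [-4, -3, -2, -1, 0, 1, 2, 3, 4]

# Finite stand-in for g (assumption: a few non-negative numbers and their squares).
g_domain = [0, 1, 2]
g_range = [0, 1, 4]

print(is_injection(square, f_domain), is_surjection(square, f_domain, f_range))  # False False
print(is_injection(square, g_domain), is_surjection(square, g_domain, g_range))  # True True
```

<p>The rule <code>square</code> never changes. Only the sets it is paired with change, and that alone is enough to turn something that is neither an injection nor a surjection into a bijection.</p>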
<p>2. <em>Learn the definition of injection. Don&#8217;t try to reconstruct it for yourself.</em></p>
<p>I&#8217;ve lost count of the number of times I have seen attempted explanations like this for what it means to say that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection.</p>
<p>(i) <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> is always unique.</p>
<p>(ii) <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> takes exactly one value for every <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />.</p>
<p>(iii) If <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(y)' title='f(x)=f(y)' class='latex' />.</p>
<p>Those three attempted definitions are all completely wrong. Here are four correct ones. For all four I&#8217;ll assume that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a function with domain <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and range <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />.</p>
<p>(i) Every <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> has at most one preimage. [A <em>preimage</em> of <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> means an element <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />.]</p>
<p>(ii) For every <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />.</p>
<p>(iii) For each <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> there is at most one <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />.</p>
<p>(iv) For every <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=x%5Cne+x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne x&#039;' title='x&#92;ne x&#039;' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cne+f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;ne f(x&#039;)' title='f(x)&#92;ne f(x&#039;)' class='latex' />.</p>
<p>Of these, I recommend avoiding (iii). It&#8217;s correct, but it seems to be the definition that, if you misremember it, leads to the kind of nonsense in the three incorrect definitions above. Note that (ii) and (iv) are closely related: to convert (ii) into (iv) I simply replaced the statement</p>
<li><img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29%5Cimplies+x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)&#92;implies x=x&#039;' title='f(x)=f(x&#039;)&#92;implies x=x&#039;' class='latex' /></li>
<p>inside the quantifier by its contrapositive</p>
<li><img src='http://s0.wp.com/latex.php?latex=x%5Cne+x%27%5Cimplies+f%28x%29%5Cne+f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne x&#039;&#92;implies f(x)&#92;ne f(x&#039;)' title='x&#92;ne x&#039;&#92;implies f(x)&#92;ne f(x&#039;)' class='latex' />.</li>
<p>I recommend usually avoiding (iv) as well. Why? Because it is usually easier to work with the &#8220;positive&#8221; hypothesis that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' /> than it is to work with the &#8220;negative&#8221; hypothesis that <img src='http://s0.wp.com/latex.php?latex=x%5Cne+x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne x&#039;' title='x&#92;ne x&#039;' class='latex' />. </p>
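<p>For functions between small finite sets, the correct definitions can be checked by brute force. The sketch below is illustrative only: it assumes that a function is stored as a Python dictionary from elements of A to elements of B, and the names <code>definition_i</code>, <code>definition_ii</code>, <code>definition_iv</code> and <code>wrong_definition_iii</code> are invented for this example. Definition (iii) is left out because it is a rewording of (i). The last function encodes the third incorrect attempt, to show that it does not distinguish injections from anything else.</p>

```python
from itertools import product


def preimages(f, y):
    """All x in the domain with f(x) = y; f is a dict from domain to range."""
    return [x for x in f if f[x] == y]


def definition_i(f, B):
    """Every y in B has at most one preimage."""
    return all(len(preimages(f, y)) in (0, 1) for y in B)


def definition_ii(f):
    """For every x, x' in A: if f(x) = f(x') then x = x'."""
    return all(x == xp for x, xp in product(f, repeat=2) if f[x] == f[xp])


def definition_iv(f):
    """For every x, x' in A: if x != x' then f(x) != f(x')."""
    return all(f[x] != f[xp] for x, xp in product(f, repeat=2) if x != xp)


def wrong_definition_iii(f):
    """If x = y then f(x) = f(y): true of every function whatsoever."""
    return all(f[x] == f[y] for x, y in product(f, repeat=2) if x == y)


B = ["a", "b", "c"]
injective = {1: "a", 2: "b"}
not_injective = {1: "a", 2: "a"}

for f in (injective, not_injective):
    print(definition_i(f, B), definition_ii(f), definition_iv(f), wrong_definition_iii(f))
# True True True True
# False False False True
```

<p>The three correct definitions agree on both examples, while the incorrect one says yes to a function that is plainly not an injection.</p>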
<p>3. <em>How to prove that a function is an injection.</em></p>
<p>If you are asked to show that a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is an injection, then my advice is to use definition (ii) above, by which I of course mean the correct definition (ii) rather than the incorrect one. Here&#8217;s the definition again.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />.</li>
<p>How do we use that? Well, it involves a two-step process that you should program into your brain until you do it without thinking. If you ever find yourself asked to prove that a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is an injection, you first bring the definition above into the front of your mind. In order to be able to do that, you have to have learnt the definition first. A major theme of these posts will be that you can program yourself to do a lot of things automatically, but in order to do so you have to invest some time learning definitions and basic results. </p>
<p>Once you have recalled the definition, the second step is very easy indeed, and is something that I covered in <a href="http://gowers.wordpress.com/2011/10/07/basic-logic-tips-for-handling-variables/">the post about handling variables</a>. If you have to prove a statement of the form</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; P(x)' title='x&#92;in X&#92; &#92; P(x)' class='latex' /></li>
<p>then you write down the words, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />&#8221;. I call this the &#8220;let&#8221; trick. A great thing it does for you is remove one layer of quantification: if you find lots of quantifiers scary, then removing one can only be good news.</p>
<p>In most situations that crop up, the statement <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' /> is itself of the form <img src='http://s0.wp.com/latex.php?latex=Q%28x%29%5Cimplies+R%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)&#92;implies R(x)' title='Q(x)&#92;implies R(x)' class='latex' />, and then you can do a little more while still on autopilot. If you need to prove the statement</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+Q%28x%29%5Cimplies+R%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; Q(x)&#92;implies R(x)' title='x&#92;in X&#92; &#92; Q(x)&#92;implies R(x)' class='latex' /></li>
<p>then the first words you write should be</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' />.</li>
<p>or perhaps</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be an element of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' />.</li>
<p>At any rate, you should <em>declare</em> the variable <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and then tell your reader that you are assuming <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' />. </p>
<p>Let us see how that applies here. The statement we want to prove is as follows.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />.</li>
<p>Applying the &#8220;let&#8221; trick, we begin as follows.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />.</li>
<p>We then see that we want to show that one thing implies another. So we follow up with what one might call the &#8220;suppose&#8221; trick, and end up with this.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, and suppose that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />.</li>
<p>Up to this point, <em>no thought whatsoever has been required</em>. So if you have any difficulty before getting to this stage, it is an excellent example of what I called a &#8220;fake difficulty&#8221; in an earlier post. The only reasons you might have for finding it difficult are</p>
<p>(a) that you haven&#8217;t learnt the definition of an injection;</p>
<p>(b) that you haven&#8217;t programmed yourself to apply the &#8220;let&#8221; and &#8220;suppose&#8221; tricks automatically.</p>
<p>Neither of those counts as a legitimate difficulty.</p>
<p>Let me show this process in action. You were shown the proof that a composition of two injections is an injection. This is a classic example of a result with a proof that you should try not to learn. Of course, you need to know the proof, but instead of learning it, you should try to <em>find it easy</em>. If you don&#8217;t already find it easy, let me show you just how little you have to think about if you follow the advice I have just given.</p>
<p>First, we ought to write out more precisely the result we are trying to prove, making clear what the domains and ranges are, declaring all our variables, and so on.</p>
<p><strong>Proposition.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> be sets, let <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=g%3AB%5Cto+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g:B&#92;to C' title='g:B&#92;to C' class='latex' />. Suppose that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> are injections. Then their composition <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is an injection.</em></p>
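<p>Before proving a statement like this, it can be reassuring to watch it hold in small cases. The following sketch makes several assumptions for the sake of illustration: functions are stored as dictionaries, the three sets are tiny, and the helper names <code>all_functions</code>, <code>is_injection</code> and <code>compose</code> are hypothetical. It runs through every pair of functions between the sets and searches for a pair of injections whose composition fails to be an injection.</p>

```python
from itertools import product


def all_functions(domain, codomain):
    """Yield every function from domain to codomain as a dict."""
    for values in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))


def is_injection(f):
    """f is an injection exactly when no value is taken twice."""
    return len(set(f.values())) == len(f)


def compose(g, f):
    """The composition of g with f, sending x to g(f(x))."""
    return {x: g[f[x]] for x in f}


# Small illustrative sets (an assumption; any finite sets would do).
A, B, C = [0, 1], [0, 1, 2], [0, 1, 2, 3]

counterexamples = [
    (f, g)
    for f in all_functions(A, B)
    for g in all_functions(B, C)
    if is_injection(f) and is_injection(g) and not is_injection(compose(g, f))
]
print(counterexamples)  # []
```

<p>The search comes back empty, which is what the Proposition predicts. Of course this is only a sanity check on a few small sets, and it is no substitute for a proof.</p>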
<p>OK, let&#8217;s get started with the proof. What is it that we are trying to prove? Oh yes: that the function <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is an injection. And what does that mean? If you&#8217;ve got the definition at your fingertips you&#8217;ll remember instantly that we need to show that for every <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f%28x%29%3Dg%5Ccirc+f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f(x)=g&#92;circ f(x&#039;)' title='g&#92;circ f(x)=g&#92;circ f(x&#039;)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />. Applying the &#8220;let&#8221; and &#8220;suppose&#8221; tricks, we write this:</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f%28x%29%3Dg%5Ccirc+f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f(x)=g&#92;circ f(x&#039;)' title='g&#92;circ f(x)=g&#92;circ f(x&#039;)' class='latex' />.</li>
<p>Now there&#8217;s something else that is good to do in a situation like this, which is to write the expressions <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f(x)' title='g&#92;circ f(x)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f(x&#039;)' title='g&#92;circ f(x&#039;)' class='latex' /> in a more transparent way. We define the value of <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> at <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> to be <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))' title='g(f(x))' class='latex' />, so why don&#8217;t we write it like that, so we can see more easily what is being said? In fact, let&#8217;s rewrite the sentence I&#8217;ve just written, so that it reads as follows.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28f%28x%27%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(f(x&#039;))' title='g(f(x))=g(f(x&#039;))' class='latex' />.</li>
<p>How did I know that it would be a good idea to rewrite the sentence like that? Well, I have to admit that it was a matter of experience, but there are a few situations where this kind of rewriting is almost always a good idea. This is one of them. Another is when you have a sentence of the form <img src='http://s0.wp.com/latex.php?latex=x%5Cin+E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in E' title='x&#92;in E' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' /> is a set that has been defined in terms of some property. For example, the closed interval <img src='http://s0.wp.com/latex.php?latex=%5Ba%2Cb%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='[a,b]' title='[a,b]' class='latex' /> is defined as the set of all real numbers <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=a%5Cleq+x%5Cleq+b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;leq x&#92;leq b' title='a&#92;leq x&#92;leq b' class='latex' />. If you see the sentence <img src='http://s0.wp.com/latex.php?latex=x%5Cin+%5Ba%2Cb%5D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in [a,b]' title='x&#92;in [a,b]' class='latex' />, you will usually make life easier if you replace it by the equivalent sentence <img src='http://s0.wp.com/latex.php?latex=a%5Cleq+x%5Cleq+b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;leq x&#92;leq b' title='a&#92;leq x&#92;leq b' class='latex' />. As you go through the course, you should make a little collection of useful rewritings of this kind so that you do them without thinking. I&#8217;ll do what I can to get you started. </p>
<p>Right, where were we? We had written this.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28f%28x%27%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(f(x&#039;))' title='g(f(x))=g(f(x&#039;))' class='latex' />.</li>
<p>Our aim at this point is to prove that <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />. (If you don&#8217;t find it obvious that that is our aim, then you should go back and reread this section.) Why should <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be equal? It&#8217;s not clear, but then why would it be clear? After all, we haven&#8217;t used the hypotheses yet!</p>
<p>Our hypotheses were that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection and that <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is an injection. If you have the definition of injection at your fingertips, then you will know that this tells you the following.</p>
<li>Whenever <img src='http://s0.wp.com/latex.php?latex=x%2Cx%27%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,x&#039;&#92;in A' title='x,x&#039;&#92;in A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />, it must be the case that <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />.</li>
<li>Whenever <img src='http://s0.wp.com/latex.php?latex=y%2Cy%27%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y,y&#039;&#92;in B' title='y,y&#039;&#92;in B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3Dg%28y%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=g(y&#039;)' title='g(y)=g(y&#039;)' class='latex' />, it must be the case that <img src='http://s0.wp.com/latex.php?latex=y%3Dy%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=y&#039;' title='y=y&#039;' class='latex' />.</li>
<p>The first of these statements looks very promising, since the conclusion we are aiming for is precisely that two elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> are equal. But to use this statement we need to establish that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />, and all we have so far is that <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28f%28x%27%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(f(x&#039;))' title='g(f(x))=g(f(x&#039;))' class='latex' />. So we&#8217;re not in a position to use the first hypothesis yet. </p>
<p>What about the second? For this we need to find two elements <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#039;' title='y&#039;' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3Dg%28y%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=g(y&#039;)' title='g(y)=g(y&#039;)' class='latex' />. We know that <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28f%28x%27%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(f(x&#039;))' title='g(f(x))=g(f(x&#039;))' class='latex' />. Does that give us our <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#039;' title='y&#039;' class='latex' />?</p>
<p>Yes it does. Since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> takes elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> to elements of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />, we know that <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x&#039;)' title='f(x&#039;)' class='latex' /> are elements of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />. So it would be insane not to see what happens if we set <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%27%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#039;=f(x&#039;)' title='y&#039;=f(x&#039;)' class='latex' />. If we do so, then we find that <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3Dg%28y%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=g(y&#039;)' title='g(y)=g(y&#039;)' class='latex' /> and therefore that <img src='http://s0.wp.com/latex.php?latex=y%3Dy%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=y&#039;' title='y=y&#039;' class='latex' /> (because <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is an injection). 
Remembering what <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#039;' title='y&#039;' class='latex' /> were, we realize that we have established that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />. But <em>that</em> was what we were looking for in order to use the fact that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection, so now we find that <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />, as we wanted. </p>
<p>Now that was a rather long account of what might happen in your head, but the proof itself is very short. Here it is in full.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#039;' title='x&#039;' class='latex' /> be elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28f%28x%27%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(f(x&#039;))' title='g(f(x))=g(f(x&#039;))' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is an injection, it follows that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection, it follows that <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />. This proves that <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is an injection.</li>
<p>Note that in that proof I did not write out the definition of an injection. I just had it in my head and I assumed that the reader did too. I also assumed that the reader was happy with the &#8220;let&#8221; and &#8220;suppose&#8221; tricks. </p>
<p>Note also that, as the above proof makes very clear, one can think of &#8220;<img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection&#8221; as a kind of cancellation law. It says that whenever you write something of the form <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' /> you can &#8220;cancel&#8221; the <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />s on both sides to get <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />. So the phrase &#8220;since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection&#8221; can be read as &#8220;I&#8217;m allowed to cancel the <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />s &#8212; honest.&#8221;</p>
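<p>For finite sets the statement can also be checked by brute force. The following is only a sketch: the helper names <code>is_injection</code> and <code>compose</code> and the example data are made up for illustration, and the check uses the contrapositive form of the definition (distinct inputs give distinct outputs). It assumes the elements of the domain are listed without repetition.</p>

```python
from itertools import combinations

def is_injection(f, domain):
    """Return True if f takes distinct values at distinct points of the finite `domain`."""
    return all(f(x) != f(x2) for x, x2 in combinations(list(domain), 2))

def compose(g, f):
    """Return the composition "g after f"."""
    return lambda x: g(f(x))

# Hypothetical example data: f maps A into B, and g is defined on B.
A = [0, 1, 2, 3]
f = lambda x: 2 * x + 1      # injective on A, with values 1, 3, 5, 7
B = [f(x) for x in A]
g = lambda y: y * y          # injective on B, since B contains only positive numbers

assert is_injection(f, A) and is_injection(g, B)
print(is_injection(compose(g, f), A))   # prints True
```

<p>A check like this is evidence for one example, not a proof: the proof above covers all sets and all injections at once.</p>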
<p>Here is a second example. If you have understood the above, then you might like to try it yourself before reading on: you should find it easy. Recall that a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> is <em>strictly increasing</em> if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Cf%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&lt;f(y)' title='f(x)&lt;f(y)' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />. </p>
<p><strong>Exercise.</strong> Let <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> be a strictly increasing function. Prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection.</p>
<p>The point of this exercise is to write out a short, crisp, utterly rigorous and non-wordy argument of the kind that I used to prove that a composition of injections is an injection, using the proof-generating techniques I have been talking about. Merely satisfying yourself that the statement is &#8220;clearly true&#8221; is not enough. I&#8217;d almost go as far as to say that if you have to stop and think why the result is true (rather than just proceeding on autopilot and never getting stuck), then you haven&#8217;t done it properly. While that&#8217;s a slight exaggeration, I am at least happy to say that thought is not needed, and even that there is a danger that too much thought will lead you astray.</p>
<p>OK, let&#8217;s try operating on autopilot. We are trying to show that our strictly increasing function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection. So we write out the following first line without thinking.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(y)' title='f(x)=f(y)' class='latex' />.</li>
<p>Hmm &#8230; we seem to be in trouble, because there isn&#8217;t an obvious way of using the fact that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is strictly increasing. Or rather, there is, but the argument is more like one by contradiction: we want to say to ourselves that the fact that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(y)' title='f(x)=f(y)' class='latex' /> tells us that <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> can&#8217;t be less than <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> (or else <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> would be less than <img src='http://s0.wp.com/latex.php?latex=f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)' title='f(y)' class='latex' />) and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> can&#8217;t be less than <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> (or else <img src='http://s0.wp.com/latex.php?latex=f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)' title='f(y)' class='latex' /> would be less than <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' />). So the only possibility left is that <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' />. </p>
<p>But the fact that we&#8217;ve ended up arguing like that suggests that we would have been better off going for the contrapositive in the first place. That is, we could take definition (iv) as the one we are using and start the argument as follows.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers with <img src='http://s0.wp.com/latex.php?latex=x%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne y' title='x&#92;ne y' class='latex' />.</li>
<p>How do we use the fact that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is strictly increasing? Well, if <img src='http://s0.wp.com/latex.php?latex=x%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne y' title='x&#92;ne y' class='latex' /> then either <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=y%3Cx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&lt;x' title='y&lt;x' class='latex' />. That gives us our way in. So the next line of the proof is as follows.</p>
<li>Then either <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=y%3Cx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&lt;x' title='y&lt;x' class='latex' />.</li>
<p>Now recall what I said in the post on AND and OR: if you want to use a statement with an OR in the middle, then you have to split into cases and show that the conclusion holds in all cases. So that&#8217;s what we do next.</p>
<li>Since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is strictly increasing, if <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Cf%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&lt;f(y)' title='f(x)&lt;f(y)' class='latex' />, and if <img src='http://s0.wp.com/latex.php?latex=y%3C+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&lt; x' title='y&lt; x' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28y%29%3Cf%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)&lt;f(x)' title='f(y)&lt;f(x)' class='latex' />. Either way, <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cne+f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;ne f(y)' title='f(x)&#92;ne f(y)' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection.</li>
<p>I think that is the most natural proof of this statement, but just out of interest, here&#8217;s a slightly different way of doing it. It&#8217;s not all that different really &#8212; it just shifts the taking of the contrapositive to an earlier stage and packages it up as a lemma. (A <em>lemma</em>, by the way, is a statement that you establish separately with a view to using it to prove other statements.)</p>
<p><strong>Lemma.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> be a strictly increasing function and let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers. If <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cleq+f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;leq f(y)' title='f(x)&#92;leq f(y)' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%5Cleq+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;leq y' title='x&#92;leq y' class='latex' />.</em></p>
<p><strong>Proof.</strong> This is simply the contrapositive of the statement that if <img src='http://s0.wp.com/latex.php?latex=y%3Cx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&lt;x' title='y&lt;x' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28y%29%3Cf%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)&lt;f(x)' title='f(y)&lt;f(x)' class='latex' />.</p>
<p>With that lemma to hand, we can now argue as follows.</p>
<p><strong>Proposition.</strong> <em>Let <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> be a strictly increasing function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection.</em></p>
<p><strong>Proof.</strong> Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers with <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(y)' title='f(x)=f(y)' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cleq+f%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;leq f(y)' title='f(x)&#92;leq f(y)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28y%29%5Cleq+f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)&#92;leq f(x)' title='f(y)&#92;leq f(x)' class='latex' />. By the lemma, it follows that <img src='http://s0.wp.com/latex.php?latex=x%5Cleq+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;leq y' title='x&#92;leq y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%5Cleq+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;leq x' title='y&#92;leq x' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' />. <img src='http://s0.wp.com/latex.php?latex=%5Csquare&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;square' title='&#92;square' class='latex' /></p>
<p>Notice that once I had packaged up the contrapositive-taking inside the lemma, I was able to use the &#8220;let&#8221; and &#8220;suppose&#8221; tricks with Definition (ii) rather than Definition (iv). </p>
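<p>The case split in the first proof can be mirrored in a few lines of code. This is a sketch on a finite sample only, with made-up helper names (<code>is_strictly_increasing_on</code>, <code>values_differ</code>) and a made-up example function; it illustrates the shape of the argument rather than proving anything about all real numbers.</p>

```python
def is_strictly_increasing_on(f, points):
    """Check f(a) < f(b) for consecutive sorted sample points (enough, by transitivity)."""
    pts = sorted(set(points))
    return all(f(a) < f(b) for a, b in zip(pts, pts[1:]))

def values_differ(f, x, y):
    """Mirror the case split in the proof for a pair with x != y."""
    if x == y:
        raise ValueError("the contrapositive starts from x != y")
    if x < y:
        return f(x) < f(y)    # first case: x < y
    return f(y) < f(x)        # second case: y < x

# Hypothetical example: x^3 + x is strictly increasing; integer inputs keep the arithmetic exact.
f = lambda x: x**3 + x
sample = range(-5, 6)
assert is_strictly_increasing_on(f, sample)
assert all(values_differ(f, x, y) for x in sample for y in sample if x != y)
```

<p>In either branch a strict inequality between the two values comes out, which is exactly why they cannot be equal.</p>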
<p>4. <em>How to prove that a function is a surjection.</em></p>
<p>I&#8217;ll be briefer about this, since I&#8217;ve made several general points in the previous section that I don&#8217;t need to repeat. </p>
<p>Suppose, then, that we are given a function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> and are asked to prove that it is a surjection. How should the proof begin? Once again, we should bring the relevant definition to the front of our mind, which is as follows:</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is a <em>surjection</em> if for every <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />. </li>
<p>Since what we are trying to prove begins, &#8220;For every <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' />,&#8221; we can apply the &#8220;let&#8221; trick. Therefore, the first line of the proof should be, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' />.&#8221; Once we&#8217;ve written that, our task is to prove that there is some <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />. That is, we are trying to solve an equation.</p>
<p>Up to now I may have given the impression that I don&#8217;t believe that there is such a thing as genuine mathematical difficulty. But that is not at all what I think. There are many apparent difficulties that shouldn&#8217;t be difficulties at all, but as a general rule, <em>finding things</em> is where the going starts to get tough. Sometimes that isn&#8217;t true and those things are handed to you on a plate, but sometimes very sophisticated mathematics is needed. Here we have an arbitrary <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and we need to show that there is some <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />. Let me give a few examples, some where it is easy and some where it is hard.</p>
<p>Example 1. Let <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> be given by the formula <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dax%2Bb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=ax+b' title='f(x)=ax+b' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are real numbers and <img src='http://s0.wp.com/latex.php?latex=a%5Cne+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;ne 0' title='a&#92;ne 0' class='latex' />. Prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection.</p>
<p>Solution: Let <img src='http://s0.wp.com/latex.php?latex=y%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in&#92;mathbb{R}' title='y&#92;in&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f%28%28y-b%29%2Fa%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f((y-b)/a)=y' title='f((y-b)/a)=y' class='latex' />. The result follows.</p>
<p>To come up with that proof I solved the equation <img src='http://s0.wp.com/latex.php?latex=ax%2Bb%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ax+b=y' title='ax+b=y' class='latex' /> for <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. </p>
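<p>As a quick sanity check of that formula, here is a minimal sketch that uses exact rational arithmetic in place of real numbers. The names <code>affine</code> and <code>affine_preimage</code> and the particular values of the coefficients are assumptions made for the example.</p>

```python
from fractions import Fraction

def affine(a, b):
    """Return the function x -> a*x + b."""
    return lambda x: a * x + b

def affine_preimage(a, b, y):
    """Solve a*x + b = y for x; this needs a != 0, just as in the example."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return (y - b) / a

# Hypothetical coefficients, chosen only for illustration.
a, b = Fraction(3), Fraction(-2)
f = affine(a, b)
for y in [Fraction(0), Fraction(7, 5), Fraction(-11)]:
    x = affine_preimage(a, b, y)
    assert f(x) == y          # the point we found really does map to y
```

<p>The check that <code>a</code> is non-zero is where the hypothesis of the example gets used: without it the division is not allowed.</p>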
<p>Example 2. Let <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> be given by the formula <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E5%2Bx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^5+x' title='f(x)=x^5+x' class='latex' />. Prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection.</p>
<p>Solution: Let <img src='http://s0.wp.com/latex.php?latex=y%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in&#92;mathbb{R}' title='y&#92;in&#92;mathbb{R}' class='latex' />. If <img src='http://s0.wp.com/latex.php?latex=x%3E%7Cy%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&gt;|y|' title='x&gt;|y|' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%5E5%2Bx%3E%7Cy%7C%5Cgeq+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^5+x&gt;|y|&#92;geq y' title='x^5+x&gt;|y|&#92;geq y' class='latex' />, and if <img src='http://s0.wp.com/latex.php?latex=x%3C-%7Cy%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;-|y|' title='x&lt;-|y|' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%5E5%2Bx%3C-%7Cy%7C%5Cleq+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^5+x&lt;-|y|&#92;leq y' title='x^5+x&lt;-|y|&#92;leq y' class='latex' />. Therefore, we can find <img src='http://s0.wp.com/latex.php?latex=x_1%3Cx_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_1&lt;x_2' title='x_1&lt;x_2' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x_1%29%3Cy%3Cf%28x_2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x_1)&lt;y&lt;f(x_2)' title='f(x_1)&lt;y&lt;f(x_2)' class='latex' />. Since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is continuous, there must be some <img src='http://s0.wp.com/latex.php?latex=u&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='u' title='u' class='latex' /> between <img src='http://s0.wp.com/latex.php?latex=x_1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_1' title='x_1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_2' title='x_2' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28u%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(u)=y' title='f(u)=y' class='latex' />.</p>
<p>Here I used the concept of continuity and a result called the intermediate value theorem: you will cover these in Analysis I next term. I used that result because I don&#8217;t know how to solve the equation <img src='http://s0.wp.com/latex.php?latex=x%5E5%2Bx%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^5+x=y' title='x^5+x=y' class='latex' /> (and neither do you). So I contented myself with proving that <em>there exists</em> a solution.</p>
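<p>Although there is no formula for the solution, one can approximate it numerically. The sketch below uses bisection, which is not part of the proof above: the proof needs only existence, and bisection is simply a standard way of homing in on a point whose existence the intermediate value theorem guarantees. The function name <code>approximate_preimage</code>, the bracket of width <code>|y| + 1</code> on each side and the iteration count are all choices made for this illustration, and floating-point numbers stand in for real numbers.</p>

```python
def f(x):
    return x**5 + x

def approximate_preimage(y, iterations=200):
    """Approximate a solution u of u**5 + u = y by repeated halving."""
    # As in the solution above: x1 < -|y| gives f(x1) < y, and x2 > |y| gives f(x2) > y.
    x1, x2 = -(abs(y) + 1.0), abs(y) + 1.0
    for _ in range(iterations):
        mid = (x1 + x2) / 2
        if f(mid) < y:
            x1 = mid      # keep f(x1) < y
        else:
            x2 = mid      # keep f(x2) >= y
    return (x1 + x2) / 2

u = approximate_preimage(3.0)
print(u, f(u))   # f(u) should be very close to 3.0
```

<p>The loop keeps the value <code>y</code> trapped between <code>f(x1)</code> and <code>f(x2)</code> while the interval shrinks, which is the same picture as in the solution above.</p>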
<p>Example 3. A composition of two surjections is a surjection.</p>
<p>Solution: [Obtaining this argument is an entirely thought-free process. First we set up the problem properly by giving names to things.] Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> be sets and let <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=g%3AB%5Cto+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g:B&#92;to C' title='g:B&#92;to C' class='latex' /> be surjections. [Next, we apply the "let" trick to the statement we are trying to prove, which is that <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is a surjection, which is the statement that for every <img src='http://s0.wp.com/latex.php?latex=z%5Cin+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='z&#92;in C' title='z&#92;in C' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f%28x%29%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f(x)=z' title='g&#92;circ f(x)=z' class='latex' /> -- which we rewrite as <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=z' title='g(f(x))=z' class='latex' />.] Let <img src='http://s0.wp.com/latex.php?latex=z%5Cin+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='z&#92;in C' title='z&#92;in C' class='latex' />. 
[We can't immediately see an <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> that will do, so let's turn to the hypotheses that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection and that <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is a surjection. The first one doesn't help us yet, but the second does.] Since <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is a surjection there exists <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=z' title='g(y)=z' class='latex' />. [Now we've got a point in <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> so we can use the hypothesis that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection.] Since <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection, there exists <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />. 
[Note that I went from "there exists <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />" to behaving as though I had fixed <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, without actually saying that that was what I was doing. See <a href="http://gowers.wordpress.com/2011/10/07/basic-logic-tips-for-handling-variables/">the post on handling variables</a> for a discussion of this. As for the proof we are trying to write, it's basically over.] But then <img src='http://s0.wp.com/latex.php?latex=g%28f%28x%29%29%3Dg%28y%29%3Dz&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(f(x))=g(y)=z' title='g(f(x))=g(y)=z' class='latex' />. Therefore, <img src='http://s0.wp.com/latex.php?latex=g%5Ccirc+f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g&#92;circ f' title='g&#92;circ f' class='latex' /> is a surjection.</p>
<p>I mentioned earlier that one way of thinking of the hypothesis that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection is that it tells you that you can cancel <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> in any equation of the form <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Df%28x%27%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=f(x&#039;)' title='f(x)=f(x&#039;)' class='latex' />. You can think of the hypothesis that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection as saying that the equation <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' /> always has at least one solution. </p>
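<p>For finite sets, the proof in Example 3 is effectively a recipe for solving an equation in two steps: first solve for a point of the middle set using the second function, then solve for a point of the first set using the first function. Here is a minimal sketch; the helper name <code>find_preimage</code> and the sets and functions are invented for the example.</p>

```python
def find_preimage(f, domain, y):
    """Return some x in the finite `domain` with f(x) == y, or None if there is none."""
    for x in domain:
        if f(x) == y:
            return x
    return None

# Hypothetical example data: f maps A onto B, and g maps B onto C.
A = range(6)
B = [0, 1, 2]
C = [0, 1]
f = lambda x: x % 3
g = lambda y: y % 2

for z in C:
    y = find_preimage(g, B, z)    # exists because g is a surjection
    x = find_preimage(f, A, y)    # exists because f is a surjection
    assert g(f(x)) == z           # so x solves g(f(x)) = z
```

<p>The order of the two searches matches the proof: the hypothesis about the first function cannot be used until the second function has produced a point of the middle set.</p>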
<p>Example 4. The <em>closed unit disc</em> <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> is the set of all points <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin%5Cmathbb%7BR%7D%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in&#92;mathbb{R}^2' title='(x,y)&#92;in&#92;mathbb{R}^2' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%5E2%2By%5E2%5Cleq+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2+y^2&#92;leq 1' title='x^2+y^2&#92;leq 1' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> be a continuous function from <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%2Cy%29%3D%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x,y)=(x,y)' title='f(x,y)=(x,y)' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=x%5E2%2By%5E2%3D1.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2+y^2=1.' title='x^2+y^2=1.' class='latex' /> (That is, <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> &#8220;fixes the boundary&#8221;.) Prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection.</p>
<p>Solution: This is meant to illustrate that proving that a function is a surjection sometimes involves maths that is well beyond what you have covered so far. It relies on a result called <a href="http://en.wikipedia.org/wiki/Brouwer_fixed_point_theorem">Brouwer&#8217;s fixed point theorem</a>. That result says that every continuous function from <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> has a fixed point. That is, if <img src='http://s0.wp.com/latex.php?latex=g%3AD%5Cto+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g:D&#92;to D' title='g:D&#92;to D' class='latex' /> is continuous, then there must be some <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)&#92;in D' title='(x,y)&#92;in D' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g%28x%2Cy%29%3D%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x,y)=(x,y)' title='g(x,y)=(x,y)' class='latex' />. </p>
<p>We shall prove the result by contradiction, using Brouwer&#8217;s theorem. That is, we shall assume that there exists a continuous function <img src='http://s0.wp.com/latex.php?latex=f%3AD%5Cto+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:D&#92;to D' title='f:D&#92;to D' class='latex' /> that fixes the boundary and is not a surjection, and we shall attempt to prove, from that, that there is a continuous function from <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> with no fixed point, which will contradict Brouwer&#8217;s fixed point theorem. (Of course, <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> isn&#8217;t such a function, since every point on the boundary of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> is a fixed point of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />.)</p>
<p>Now, we are assuming that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is not a surjection, so we need to flex our logical muscles and negate a statement with quantifiers. The statement is this.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=y%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in D' title='y&#92;in D' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=x%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in D' title='x&#92;in D' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />.</li>
<p>Note that I&#8217;ve switched to representing points by single letters rather than giving them in coordinate form. So in the above statement and from now on <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are points of the disc rather than coordinates of points.</p>
<p>Following the mechanical rules set out in <a href="http://gowers.wordpress.com/2011/10/02/basic-logic-relationships-between-statements-negation/">the post on negation</a>, we obtain the negation, which is this.</p>
<li>There exists <img src='http://s0.wp.com/latex.php?latex=y%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in D' title='y&#92;in D' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in D' title='x&#92;in D' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;ne y' title='f(x)&#92;ne y' class='latex' />.</li>
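<p>On finite sets the two quantified statements translate directly into nested <code>all</code> and <code>any</code> calls, which makes it easy to see that one is the negation of the other. This is a sketch with invented function names and an invented example, not something taken from the post on negation.</p>

```python
def every_point_is_hit(f, domain, codomain):
    """For every y there exists x with f(x) == y."""
    return all(any(f(x) == y for x in domain) for y in codomain)

def some_point_is_missed(f, domain, codomain):
    """There exists y such that for every x, f(x) != y."""
    return any(all(f(x) != y for x in domain) for y in codomain)

# Hypothetical example: squaring modulo 5 misses the residues 2 and 3.
residues = range(5)
square_mod_5 = lambda x: (x * x) % 5
identity = lambda x: x

for f in (square_mod_5, identity):
    hit = every_point_is_hit(f, residues, residues)
    missed = some_point_is_missed(f, residues, residues)
    assert missed == (not hit)    # each statement is the negation of the other
```

<p>Swapping <code>all</code> with <code>any</code> and negating the innermost condition is the same mechanical move as swapping the two quantifiers and negating the final statement.</p>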
<p>Now we want to define a continuous function with no fixed point. At this point we are trying to <em>find</em> or <em>make</em> something, so if we run into difficulties, we at least have the comfort of knowing that they are not fake difficulties. I happen to know how this one goes because at some point in the past I was shown it &#8230;</p>
<p>The idea is this. We shall do things in two stages. The first stage is to define something called a <em>continuous retraction</em> from <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> to its boundary. That means a continuous function <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> to the boundary of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> that sends each boundary point to itself. (If you&#8217;re worried that such a function can&#8217;t exist, then don&#8217;t, because you&#8217;re right that it can&#8217;t. We&#8217;re writing a proof by contradiction, and that means that we&#8217;re in nonsense world.) For each <img src='http://s0.wp.com/latex.php?latex=x%5Cin+D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in D' title='x&#92;in D' class='latex' /> we need to say what boundary point <img src='http://s0.wp.com/latex.php?latex=h%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)' title='h(x)' class='latex' /> is. To do that, we take the point <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> given above (by the negation of the statement that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection) and draw a straight line segment from <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' />. 
Note that our hypothesis tells us that <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' /> are different points. We then continue along that line until we hit the boundary of <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> and let <img src='http://s0.wp.com/latex.php?latex=h%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)' title='h(x)' class='latex' /> be the point where we hit. </p>
<p>If you know the right bits of maths (which you don&#8217;t), then it isn&#8217;t hard to show that <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> is continuous. Also, if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is on the boundary, then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x' title='f(x)=x' class='latex' /> (by hypothesis), from which it follows that <img src='http://s0.wp.com/latex.php?latex=h%28x%29%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)=x' title='h(x)=x' class='latex' /> (by the way we defined <img src='http://s0.wp.com/latex.php?latex=h%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)' title='h(x)' class='latex' />). So we&#8217;ve got our continuous retraction.</p>
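<p>The geometric step in the definition of <img src='http://s0.wp.com/latex.php?latex=h%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)' title='h(x)' class='latex' /> can be made concrete. The following Python sketch assumes that <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> is the unit disc centred at the origin; the name <code>hit_boundary</code> and the argument <code>p</code> (which plays the role of <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' />) are hypothetical labels. It does not, of course, produce the impossible function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />: it only works out where the line from <code>y</code> through a given point <code>p</code> hits the boundary.</p>

```python
import math


def hit_boundary(y, p):
    """Return the point where the ray from y through p leaves the unit disc.

    Assumes y lies strictly inside the unit disc and that p differs from y.
    """
    dx, dy = p[0] - y[0], p[1] - y[1]
    # Points on the ray are y + t*d with t >= 0. Setting |y + t*d|^2 = 1
    # gives the quadratic a*t^2 + b*t + c = 0 in t.
    a = dx * dx + dy * dy
    b = 2 * (y[0] * dx + y[1] * dy)
    c = y[0] * y[0] + y[1] * y[1] - 1  # negative, because y is interior
    # Since a is positive and c is negative, exactly one root is positive;
    # that root is the parameter at which the ray meets the boundary.
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return (y[0] + t * dx, y[1] + t * dy)
```

<p>If <code>p</code> is already on the boundary, the positive root is <code>t = 1</code> and the function returns <code>p</code> itself, up to rounding. That is the computational counterpart of the observation that boundary points are sent to themselves.</p>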
<p>To create a continuous function with no fixed point, we just rotate the continuous retraction. That is, we just define <img src='http://s0.wp.com/latex.php?latex=g%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)' title='g(x)' class='latex' /> to be what you get if you rotate <img src='http://s0.wp.com/latex.php?latex=h%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)' title='h(x)' class='latex' /> through, say, 90 degrees anticlockwise about the origin. Since <img src='http://s0.wp.com/latex.php?latex=g%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)' title='g(x)' class='latex' /> always lies on the boundary, the only way <img src='http://s0.wp.com/latex.php?latex=g%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)' title='g(x)' class='latex' /> could conceivably equal <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is itself on the boundary. But then <img src='http://s0.wp.com/latex.php?latex=h%28x%29%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h(x)=x' title='h(x)=x' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=g%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(x)' title='g(x)' class='latex' /> isn&#8217;t <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />, since it&#8217;s a rotation of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. </p>
<p><strong>Two important contrasts.</strong></p>
<p><em>Concrete problems versus abstract problems.</em></p>
<p>Amongst the surjection proofs I&#8217;ve just discussed, there are two very different kinds. One is where I have a specific function, such as <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E5%2Bx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^5+x' title='f(x)=x^5+x' class='latex' /> (from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />) and I show that it is a surjection. This I call a <em>concrete</em> problem. The other is where I have an unknown function about which I am given some information, and my task is to deduce from that information that it is a surjection. This I call an <em>abstract</em> problem.</p>
<p>It might be better to say that there is a spectrum with fully concrete at one end and fully abstract at the other. For example, when I proved that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dax%2Bb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=ax+b' title='f(x)=ax+b' class='latex' /> was a surjection, that felt pretty concrete, but I didn&#8217;t actually know what <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> were. I could have rephrased the problem to sound more abstract, by saying, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> be a linear function with gradient not equal to zero. Prove that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection.&#8221;</p>
<p><em>Finding things versus proving that they exist.</em></p>
<p>I&#8217;ve already drawn attention to this. For the first problem above, I actually <em>found</em> <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=y' title='f(x)=y' class='latex' />. For all the other problems, I proved that such an <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> exists without actually finding one.</p>
<p><strong>Graphs of injections and surjections.</strong></p>
<p>I have encouraged you <em>not to think</em> when starting to write proofs that involve injections and surjections. This may go against the grain a little: surely one should be trying to <em>understand</em> injections and surjections rather than mechanically writing proofs without having the faintest idea what they mean.</p>
<p>My actual view about this is, not surprisingly, that both understanding and mechanical fluency are important. I am stressing the mechanical fluency side of things because my experience supervising suggests that where people run into problems it is the mechanical side that needs to be improved: the picture in their brains isn&#8217;t too bad, but they have trouble turning observations about this picture into a correctly written proof.</p>
<p>Nevertheless, it is worth saying just a little bit about how to visualize injections and surjections, at least in the case of functions from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />. (Once you understand these, it should be reasonably easy to transfer your understanding to more general functions.)</p>
<p>The way to do this is to think about <em>graphs</em>. Let&#8217;s start with a very basic question: which subsets of the plane <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^2' title='&#92;mathbb{R}^2' class='latex' /> are graphs of functions from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />? Well, the point of a graph is that for each <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> coordinate there is exactly one <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> coordinate such that <img src='http://s0.wp.com/latex.php?latex=%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,y)' title='(x,y)' class='latex' /> belongs to the graph. If the function in question is <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />, then we say that <img src='http://s0.wp.com/latex.php?latex=y%3Df%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=f(x)' title='y=f(x)' class='latex' />. (Usually we think about this the other way round: we start with the function and draw the graph. Here I&#8217;m thinking about the graph and saying what the function is that it is a graph of.)</p>
<p>In visual terms, what this is saying is that a subset <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}^2' title='&#92;mathbb{R}^2' class='latex' /> is the graph of a function if and only if it satisfies the following condition.</p>
<li>Every vertical line intersects <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> in exactly one point.</li>
<p>Note that this is really saying two things: each vertical line must intersect <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> in <em>at least</em> one point (or we wouldn&#8217;t have a value for <img src='http://s0.wp.com/latex.php?latex=f%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)' title='f(x)' class='latex' />) and also <em>at most</em> one point (or we would have more than one value &#8212; which isn&#8217;t allowed). </p>
<p>Now let&#8217;s think about injections. For the purposes of this discussion it will be convenient to use the following definition.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f%3AA%5Cto+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:A&#92;to B' title='f:A&#92;to B' class='latex' /> is an <em>injection</em> if for every <img src='http://s0.wp.com/latex.php?latex=b%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;in B' title='b&#92;in B' class='latex' /> there is at most one <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28a%29%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(a)=b' title='f(a)=b' class='latex' />.</li>
<p>If we&#8217;re given the graph of a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' />, how do we find all the solutions to the equation <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=b' title='f(x)=b' class='latex' /> for some fixed <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' />? We draw the horizontal line <img src='http://s0.wp.com/latex.php?latex=y%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=b' title='y=b' class='latex' />, and at each point where it cuts the graph we have a solution. Why? Because the line <img src='http://s0.wp.com/latex.php?latex=y%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=b' title='y=b' class='latex' /> will cut the graph at a point of the form <img src='http://s0.wp.com/latex.php?latex=%28x%2Cb%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(x,b)' title='(x,b)' class='latex' />, and since the graph is of the function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' />, that tells us that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=b' title='f(x)=b' class='latex' />. </p>
<p>Therefore, if we want there to be at most one solution, that tells us that the line <img src='http://s0.wp.com/latex.php?latex=y%3Db&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=b' title='y=b' class='latex' /> cuts the graph in at most one place. That gives us the following criterion for the graph of a function to be the graph of an injection.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be the graph of a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection if and only if no horizontal line cuts <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> in more than one place.</li>
<p>A very similar line of reasoning leads to the following companion criterion for a function to be a surjection.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be the graph of a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection if and only if every horizontal line cuts <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> in at least one place.</li>
<p>Putting those two together, we get a criterion for a function to be a bijection.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> be the graph of a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a bijection if and only if every horizontal line cuts <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> in exactly one place.</li>
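<p>For functions between finite sets the three criteria above turn into counting statements, and a short Python sketch can check them. Everything here (the name <code>classify</code> and the way the sets are passed in) is an illustrative assumption rather than part of the discussion above, and the sketch assumes that <code>f</code> really does map <code>domain</code> into <code>codomain</code>.</p>

```python
from collections import Counter


def classify(f, domain, codomain):
    """Return (injection, surjection, bijection) for f on finite sets.

    For each b in the codomain we count the solutions a of f(a) = b,
    which is the finite analogue of counting where the horizontal
    line y = b cuts the graph.
    """
    hits = Counter(f(a) for a in domain)
    counts = [hits[b] for b in codomain]
    injection = all(n in (0, 1) for n in counts)  # at most one solution
    surjection = all(n >= 1 for n in counts)      # at least one solution
    return injection, surjection, injection and surjection
```

<p>For instance, <code>classify(lambda a: a * a, range(-2, 3), [0, 1, 4])</code> reports a surjection that is not an injection, because the values 1 and 4 are each hit twice.</p>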
<p>A final remark is this. If a function <img src='http://s0.wp.com/latex.php?latex=f%3A%5Cmathbb%7BR%7D%5Cto%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' title='f:&#92;mathbb{R}&#92;to&#92;mathbb{R}' class='latex' /> has an inverse, then the graph of the inverse is obtained from the graph of <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> by reflecting it in the line <img src='http://s0.wp.com/latex.php?latex=y%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=x' title='y=x' class='latex' />, as you have probably seen at A level. But if you reflect the graph of a function, you don&#8217;t necessarily get the graph of a function. When <em>do</em> you get the graph of a function? Well, after the reflection you need every vertical line to cut your set in exactly one place, so before the reflection you need every <em>horizontal</em> line to cut the set in exactly one place. We can split that up into two conditions: cutting in at least one place and cutting in at most one place. That generates the definitions of surjection and injection. So if you ever stop to wonder where those definitions come from and why they are important, the answer is that they come from a consideration of what we need if we want to invert a function.</p>
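<p>The reflection remark can also be tried out on a finite set of points. In the Python sketch below, a graph is a set of pairs, reflecting in the line <img src='http://s0.wp.com/latex.php?latex=y%3Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=x' title='y=x' class='latex' /> swaps the coordinates, and the vertical line test is applied to the result. The function names and the choice of a three-element set in the usage note are hypothetical.</p>

```python
def is_graph_of_function(points, xs):
    """Vertical line test: every x in xs is a first coordinate exactly once."""
    firsts = [x for (x, _) in points]
    return all(firsts.count(x) == 1 for x in xs)


def reflect(points):
    """Reflect a set of points in the line y = x by swapping coordinates."""
    return {(y, x) for (x, y) in points}
```

<p>On the set <code>[0, 1, 2]</code>, the graph <code>{(0, 1), (1, 2), (2, 0)}</code> passes the test both before and after reflection, whereas <code>{(0, 0), (1, 1), (2, 1)}</code> passes before but fails after: the value 1 is taken twice and the value 2 is never taken, so the function is neither an injection nor a surjection.</p>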
<p><strong>How to prove that a function is a bijection.</strong></p>
<p>Do I need a section here? Surely if I&#8217;ve discussed how to show that a function is an injection and I&#8217;ve discussed how to show that a function is a surjection, then that&#8217;s all that&#8217;s needed. If you want to show that a function is a bijection then you have to show that it is an injection and that it is a surjection. Right?</p>
<p>Wrong. Suppose you were asked to prove that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3D3x%2B5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=3x+5' title='f(x)=3x+5' class='latex' /> is a bijection from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />. Would your argument look like this?</p>
<p>First let us show that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection. Suppose that <img src='http://s0.wp.com/latex.php?latex=3x%2B5%3D3x%27%2B5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='3x+5=3x&#039;+5' title='3x+5=3x&#039;+5' class='latex' />. Subtracting 5 from both sides and then dividing both sides by 3 we find that <img src='http://s0.wp.com/latex.php?latex=x%3Dx%27&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=x&#039;' title='x=x&#039;' class='latex' />. Now let us show that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a surjection. Let <img src='http://s0.wp.com/latex.php?latex=y%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in&#92;mathbb{R}' title='y&#92;in&#92;mathbb{R}' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f%28%28y-5%29%2F3%29%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f((y-5)/3)=y' title='f((y-5)/3)=y' class='latex' />. We have shown that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is an injection and a surjection. It follows that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a bijection.</p>
<p>Perhaps it would. But then you have not yet picked up a very important mathematical principle, which is this.</p>
<li>If you are ever told that P is equivalent to Q, then from that moment on, if you are asked to prove P you have the option of proving Q instead. Moreover, proving Q is often easier.</li>
<p>Do we have any statement equivalent to &#8220;<img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a bijection&#8221;? Yes we do. A function is a bijection if and only if it has an inverse. Let&#8217;s see what happens if we try to use this. We&#8217;re trying to show that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3D3x%2B5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=3x+5' title='f(x)=3x+5' class='latex' /> is a bijection. Is it easy to show that it has an inverse? Yes, the function <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3D%28y-5%29%2F3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=(y-5)/3' title='g(y)=(y-5)/3' class='latex' /> is the inverse. Therefore, <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a bijection. </p>
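<p>The claim that <img src='http://s0.wp.com/latex.php?latex=g%28y%29%3D%28y-5%29%2F3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g(y)=(y-5)/3' title='g(y)=(y-5)/3' class='latex' /> is the inverse means that applying one function and then the other gets you back where you started, in both orders. The algebra takes one line each way, but as a sanity check here is a Python sketch that tests both compositions on some sample points. The sample points are an arbitrary choice, exact fractions are used to avoid rounding, and a check on finitely many points is of course not a proof.</p>

```python
from fractions import Fraction


def f(x):
    return 3 * x + 5


def g(y):
    return (y - 5) / 3


# Exact rational sample points (an arbitrary illustrative choice).
samples = [Fraction(n, 7) for n in range(-20, 21)]

g_undoes_f = all(g(f(x)) == x for x in samples)
f_undoes_g = all(f(g(y)) == y for y in samples)
```

<p>Both flags should come out true. Note that both compositions need checking: one of them on its own would not be enough to show that <code>g</code> is the inverse.</p>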
<p>I&#8217;m not saying that you should always find an inverse if you want to show that a function is a bijection. What I am saying is that if it&#8217;s easy to find an inverse, then that will give you a better proof. </p>
<p>This tip holds not just for concrete problems (where you have to show that a specific function is a bijection) but also for more abstract ones (where you are given a function with certain properties and need to show that it is a bijection). And the general message that you shouldn&#8217;t always go back to the definition can be applied throughout mathematics. Indeed, so important is this principle that I plan to devote an entire post to it at some point.</p>
<p>There is one aspect of injections and surjections that I have not discussed here, which is how to show that a function is <em>not</em> an injection or a surjection. I hope to discuss that as part of a more general post on counterexamples, but let me end with a proof that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^2' title='f(x)=x^2' class='latex' /> is not an injection or a surjection, when considered as a function from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' />.</p>
<p>Proof that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is not an injection: <img src='http://s0.wp.com/latex.php?latex=f%281%29%3Df%28-1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(1)=f(-1)' title='f(1)=f(-1)' class='latex' />.</p>
<p>Proof that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is not a surjection: there is no <img src='http://s0.wp.com/latex.php?latex=x%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in&#92;mathbb{R}' title='x&#92;in&#92;mathbb{R}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=x%5E2%3D-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2=-1' title='x^2=-1' class='latex' />.</p>
<p>Note the total lack of waffle, and the fact that to prove my point I just needed one thing to go wrong in both cases.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/11/injections-surjections-and-all-that/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Basic logic &#8212; summary</title>
		<link>http://gowers.wordpress.com/2011/10/09/basic-logic-summary/</link>
		<comments>http://gowers.wordpress.com/2011/10/09/basic-logic-summary/#comments</comments>
		<pubDate>Sun, 09 Oct 2011 17:14:48 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Basic logic]]></category>
		<category><![CDATA[Cambridge teaching]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3332</guid>
		<description><![CDATA[Here is the promised post that I hope will be easier to refer back to than the much longer posts I&#8217;ve written on individual aspects of basic logic. What I imagine people doing is reading the longer posts and using this one to jog their memories later. If you can think of any important points [...]]]></description>
			<content:encoded><![CDATA[<p>Here is the promised post that I hope will be easier to refer back to than the much longer posts I&#8217;ve written on individual aspects of basic logic. What I imagine people doing is reading the longer posts and using this one to jog their memories later. If you can think of any important points that I made in earlier posts and have forgotten to mention here, I&#8217;d be grateful to know of them.</p>
<p>Once again, the main topics dealt with were these.</p>
<p><strong>Logical connectives.</strong> AND, OR, NOT, IMPLIES (or in symbols, <img src='http://s0.wp.com/latex.php?latex=%5Cwedge%2C%5Cvee%2C%5Cneg%2C%5Cimplies&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;wedge,&#92;vee,&#92;neg,&#92;implies' title='&#92;wedge,&#92;vee,&#92;neg,&#92;implies' class='latex' />).</p>
<p><strong>Quantifiers.</strong> &#8220;for every&#8221; and &#8220;there exists&#8221; (or in symbols, <img src='http://s0.wp.com/latex.php?latex=%5Cforall&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall' title='&#92;forall' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cexists&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists' title='&#92;exists' class='latex' />).</p>
<p><strong>Relationships between statements.</strong> Negation, converse, contrapositive.<br />
<span id="more-3332"></span></p>
<hr />
<p>I think I&#8217;ll just write down a numbered list of points.</p>
<ol>
<li>AND has a similar meaning in mathematics to its meaning in ordinary English, but some care is needed if you use it to connect anything other than <em>statements</em>. If you do do that, then make sure you know how to translate what you have written into more formal mathematical language where AND connects statements only.</li>
<li>OR differs slightly more from its ordinary English meaning than AND does, since in mathematics it is always the inclusive OR. One should think of <img src='http://s0.wp.com/latex.php?latex=P%5Cvee+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;vee Q' title='P&#92;vee Q' class='latex' /> as saying &#8220;P is true or Q is true or both&#8221; or &#8220;At least one of P and Q is true&#8221;.</li>
<li>AND and OR are associative. That means that <img src='http://s0.wp.com/latex.php?latex=%28P%5Cwedge+Q%29%5Cwedge+R%5Ciff+P%5Cwedge%28Q%5Cwedge+R%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(P&#92;wedge Q)&#92;wedge R&#92;iff P&#92;wedge(Q&#92;wedge R)' title='(P&#92;wedge Q)&#92;wedge R&#92;iff P&#92;wedge(Q&#92;wedge R)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28P%5Cvee+Q%29%5Cvee+R%5Ciff+P%5Cvee%28Q%5Cvee+R%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(P&#92;vee Q)&#92;vee R&#92;iff P&#92;vee(Q&#92;vee R).' title='(P&#92;vee Q)&#92;vee R&#92;iff P&#92;vee(Q&#92;vee R).' class='latex' /> In practice what this tells us is that we don&#8217;t need brackets. If we connect a bunch of statements with ANDs it means that all those statements are true, and if we connect them with ORs it means that at least one of them is true.</li>
<li>However, brackets matter very much indeed when you mix ANDs and ORs. You are not allowed to write <img src='http://s0.wp.com/latex.php?latex=P%5Cwedge+Q%5Cvee+R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;wedge Q&#92;vee R' title='P&#92;wedge Q&#92;vee R' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=P%5Cvee+Q%5Cwedge+R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;vee Q&#92;wedge R' title='P&#92;vee Q&#92;wedge R' class='latex' /> (or more wordy statements of the same kind). You have to put the brackets in to make clear in which order you are doing things.</li>
<li>AND is distributive over OR and vice versa. That means that <img src='http://s0.wp.com/latex.php?latex=P%5Cwedge%28Q%5Cvee+R%29%5Ciff%28P%5Cwedge+Q%29%5Cvee%28P%5Cwedge+R%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;wedge(Q&#92;vee R)&#92;iff(P&#92;wedge Q)&#92;vee(P&#92;wedge R)' title='P&#92;wedge(Q&#92;vee R)&#92;iff(P&#92;wedge Q)&#92;vee(P&#92;wedge R)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=P%5Cvee%28Q%5Cwedge+R%29%5Ciff%28P%5Cvee+Q%29%5Cwedge%28P%5Cvee+R%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;vee(Q&#92;wedge R)&#92;iff(P&#92;vee Q)&#92;wedge(P&#92;vee R).' title='P&#92;vee(Q&#92;wedge R)&#92;iff(P&#92;vee Q)&#92;wedge(P&#92;vee R).' class='latex' /></li>
<li>NOT is again similar but not identical to the ordinary English word. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is a statement, then the statement <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P' title='&#92;neg P' class='latex' /> should be thought of as saying &#8220;It is not the case that <img src='http://s0.wp.com/latex.php?latex=P.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P.' title='P.' class='latex' />&#8221; The main thing to remember is that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P' title='&#92;neg P' class='latex' /> is not the <em>opposite</em> of <img src='http://s0.wp.com/latex.php?latex=P.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P.' title='P.' class='latex' /> For example, if <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is the statement &#8220;<img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is the largest element of <img src='http://s0.wp.com/latex.php?latex=A%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A,' title='A,' class='latex' />&#8221; then <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P' title='&#92;neg P' class='latex' /> has nothing to do with the smallest element of <img src='http://s0.wp.com/latex.php?latex=A.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A.' title='A.' class='latex' /> It is simply the statement that <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is not the largest element of <img src='http://s0.wp.com/latex.php?latex=A.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A.' title='A.' class='latex' /></li>
<li>A good rule for checking whether your use of NOTs is correct is this: NOT turns weak statements into strong statements and vice versa.</li>
<li>de Morgan&#8217;s laws are that <img src='http://s0.wp.com/latex.php?latex=%5Cneg%28P%5Cvee+Q%29%5Ciff+%5Cneg+P%5Cwedge%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(P&#92;vee Q)&#92;iff &#92;neg P&#92;wedge&#92;neg Q' title='&#92;neg(P&#92;vee Q)&#92;iff &#92;neg P&#92;wedge&#92;neg Q' class='latex' /> and that <img src='http://s0.wp.com/latex.php?latex=%5Cneg%28P%5Cwedge+Q%29%5Ciff%5Cneg+P%5Cvee%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(P&#92;wedge Q)&#92;iff&#92;neg P&#92;vee&#92;neg Q' title='&#92;neg(P&#92;wedge Q)&#92;iff&#92;neg P&#92;vee&#92;neg Q' class='latex' />.</li>
<li>IMPLIES is, of all the connectives, the one that least resembles its ordinary English counterpart.</li>
<li>If P and Q are specific statements, such as &#8220;23 is a prime number&#8221; or &#8220;there are infinitely many primes&#8221; then <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> does not mean that there is any interesting relationship between the two statements. It merely means that it is <em>not</em> the case that P is true and Q is false. (An alternative definition is that if P is true then Q must be true. Though perfectly valid, this definition doesn&#8217;t convey quite as vividly just how unnecessary it is for the truth of P to &#8220;cause&#8221; the truth of Q.)</li>
<li>However, most of the time, we do not use IMPLIES to relate specific statements. Rather, we use it to relate <em>general</em> statements, that is, statements that involve unknown variables. And once we do that, the picture becomes more complicated.</li>
<li>The reason for the complication is that <em>general statements often look like specific cases</em>. For example, if I say that &#8220;<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a prime greater than 2&#8221; implies &#8220;<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is odd&#8221;, then it looks as though I&#8217;m talking about one particular case, that of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. However, if you ask me which <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> I&#8217;m talking about, I have to admit that I&#8217;m not talking about a specific number like 103. Rather, I am making the general statement that would normally be written using a quantifier: for <em>every</em> natural number <img src='http://s0.wp.com/latex.php?latex=n%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n,' title='n,' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a prime greater than 2 then <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is odd.</li>
<li>This means that there are two notions of implication that it is important to be aware of and not to confuse. One, which I have called the truth-value notion, is the one mentioned above. The other, which I have called the &#8220;causal&#8221; notion, is best thought of as relating <em>properties</em> rather than statements. For instance, &#8220;is a prime greater than 2&#8221; implies (in the property sense) &#8220;is odd&#8221;.</li>
<li>The two notions of implication are related as follows. Let P and Q be two properties and write <img src='http://s0.wp.com/latex.php?latex=P%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(n)' title='P(n)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(n)' title='Q(n)' class='latex' /> for the statements &#8220;<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has property P&#8221; and &#8220;<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has property Q&#8221;. Then the property P implies the property Q (in the more &#8220;causal&#8221;, property sense) if and only if for every <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> the statement <img src='http://s0.wp.com/latex.php?latex=P%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(n)' title='P(n)' class='latex' /> implies the statement <img src='http://s0.wp.com/latex.php?latex=Q%28n%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(n)' title='Q(n)' class='latex' /> (in the truth-value sense). If you don&#8217;t feel you understand this point properly, I recommend not worrying about it for now, but coming back and rereading what I&#8217;ve written if at some stage you find yourself confused about &#8220;implies&#8221;.</li>
<li>Many English words and expressions contain inside them, sometimes not very explicitly, notions of &#8220;all&#8221; or &#8220;some&#8221;. For example, &#8220;There are always golf balls in the undergrowth over there,&#8221; could be translated into more formal language as follows: &#8220;For every time <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> there exists a golf ball <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=g&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='g' title='g' class='latex' /> is in the undergrowth over there at time <img src='http://s0.wp.com/latex.php?latex=t.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t.' title='t.' class='latex' />&#8221;</li>
<li>Note that the quantifiers go at the beginning of the sentence and the variables we are talking about are specifically mentioned. That is to avoid ambiguities such as the following: &#8220;Every element of the set <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is less than some number.&#8221; Does that mean that for each element <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> you can find a <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> that&#8217;s bigger than <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />? If so, then it&#8217;s not a very interesting thing to say. It&#8217;s more likely that the intended meaning is that there is some number <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> that is bigger than every element of the set <img src='http://s0.wp.com/latex.php?latex=A.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A.' title='A.' class='latex' /> This can be made crystal clear if you follow the rule about putting quantifiers first and specifying the variables. Then you get the statement, &#8220;There exists a number <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> such that for every element <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A,' title='x&#92;in A,' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=x%3Cy.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y.' title='x&lt;y.' 
class='latex' />&#8221; (Perhaps this particular sentence is even clearer in its symbolic form: <img src='http://s0.wp.com/latex.php?latex=%5Cexists+y%5Cin%5Cmathbb%7BR%7D%5C+%5Cforall+x%5Cin+A%5C+%5C+x%3Cy.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists y&#92;in&#92;mathbb{R}&#92; &#92;forall x&#92;in A&#92; &#92; x&lt;y.' title='&#92;exists y&#92;in&#92;mathbb{R}&#92; &#92;forall x&#92;in A&#92; &#92; x&lt;y.' class='latex' />)</li>
<li>The negation of <img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5Cin+A%5C+%5C+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92;in A&#92; &#92; P(x)' title='&#92;exists x&#92;in A&#92; &#92; P(x)' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+A%5C+%5C+%5Cneg+P%28x%29%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in A&#92; &#92; &#92;neg P(x),' title='&#92;forall x&#92;in A&#92; &#92; &#92;neg P(x),' class='latex' /> and the negation of <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+A%5C+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in A&#92; P(x)' title='&#92;forall x&#92;in A&#92; P(x)' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5Cin+A%5C+%5C+%5Cneg+P%28x%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92;in A&#92; &#92; &#92;neg P(x).' title='&#92;exists x&#92;in A&#92; &#92; &#92;neg P(x).' class='latex' /> These two rules are similar to de Morgan&#8217;s laws, and the similarity is not a coincidence.</li>
<li>That isn&#8217;t surprising, as there are almost no coincidences in mathematics.</li>
<li>Using the rules for negating single quantifiers, we can negate complicated sentences with lots of quantifiers in a purely mechanical way. We start with <img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> right outside the entire sentence, and we then drag it past the quantifiers, changing each <img src='http://s0.wp.com/latex.php?latex=%5Cforall&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall' title='&#92;forall' class='latex' /> into <img src='http://s0.wp.com/latex.php?latex=%5Cexists&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists' title='&#92;exists' class='latex' /> and each <img src='http://s0.wp.com/latex.php?latex=%5Cexists&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists' title='&#92;exists' class='latex' /> into <img src='http://s0.wp.com/latex.php?latex=%5Cforall.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall.' title='&#92;forall.' class='latex' /></li>
<li>That doesn&#8217;t finish the job, because we still have to negate the inner quantifier-free statement. To do that we have to use things like de Morgan&#8217;s laws. A particularly important case is when the inner statement takes the form <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q.' title='P&#92;implies Q.' class='latex' /> (Here, and this is very important, I am thinking of <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> as statements that involve several variables that have also been involved in the quantifiers.) The negation of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=P%5Cwedge%5Cneg+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;wedge&#92;neg Q.' title='P&#92;wedge&#92;neg Q.' class='latex' /> Why? Because once we get inside all the quantifiers, we&#8217;re talking about the truth-value meaning of implication, and the condition for <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> to be false is precisely that <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> should be true and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> false.</li>
<li>A quick example. Suppose that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is some special set of integers and consider the statement &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+A%5C+%5C+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in A&#92; &#92; x' title='&#92;forall x&#92;in A&#92; &#92; x' class='latex' /> is odd <img src='http://s0.wp.com/latex.php?latex=%5Cimplies+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;implies x' title='&#92;implies x' class='latex' /> is prime.&#8221; The negation of this is &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5Cin+A%5C+%5C+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92;in A&#92; &#92; x' title='&#92;exists x&#92;in A&#92; &#92; x' class='latex' /> is odd and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is not prime.&#8221;</li>
<li>It&#8217;s worth thinking about the meaning of &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Cimplies&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;implies' title='&#92;implies' class='latex' />&#8221; in that previous example. A number&#8217;s being odd doesn&#8217;t cause it to be prime. However, the statement is saying that for every number <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> that belongs to the set <img src='http://s0.wp.com/latex.php?latex=A%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A,' title='A,' class='latex' /> it so happens that if <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is odd then it is also prime. This is very much the truth-value meaning of &#8220;<img src='http://s0.wp.com/latex.php?latex=%5Cimplies&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;implies' title='&#92;implies' class='latex' />&#8221; and not the &#8220;causal&#8221; meaning.</li>
<li>The <em>converse</em> of a statement <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is the statement <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />. Also, we often say that the converse of a statement of the form <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' title='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' class='latex' /> is the statement <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+Q%28x%29%5Cimplies+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; Q(x)&#92;implies P(x)' title='&#92;forall x&#92;in X&#92; &#92; Q(x)&#92;implies P(x)' class='latex' />.</li>
<li>If a statement is true, that does <em>not</em> make its converse automatically true: it is very easy to think of true statements with false converses. Therefore, you should be careful not to confuse a statement with its converse.</li>
<li>However, there are many interesting examples of true statements that <em>do</em> have true converses. When this happens, it is very often easier to prove the implication in one direction than it is to prove it in the other.</li>
<li>The <em>contrapositive</em> of the statement <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is the statement <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%5Cimplies%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q&#92;implies&#92;neg P' title='&#92;neg Q&#92;implies&#92;neg P' class='latex' />. The contrapositive is equivalent to the original statement (in the sense that it is true if and only if the original statement is true), but it often turns out to be easier to prove.</li>
<li>The contrapositive of the converse of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies&#92;neg Q' title='&#92;neg P&#92;implies&#92;neg Q' class='latex' />. It is sometimes easier to prove <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies&#92;neg Q' title='&#92;neg P&#92;implies&#92;neg Q' class='latex' /> than it is to prove <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />.</li>
<li>If you need to prove the converse of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' />, don&#8217;t accidentally prove that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%5Cimplies%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q&#92;implies&#92;neg P' title='&#92;neg Q&#92;implies&#92;neg P' class='latex' /> instead &#8230;</li>
<li>A <em>free variable</em> is one that is not quantified over. A <em>bound variable</em> is one that is quantified over. For example, in the statement <img src='http://s0.wp.com/latex.php?latex=%5Cexists+M%5Cin%5Cmathbb%7BR%7D%5C++%5Cforall+x%5Cin%5Cmathbb%7BR%7D%5C+%5C+%7Cf%28x%29%7C%5Cleq+M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists M&#92;in&#92;mathbb{R}&#92;  &#92;forall x&#92;in&#92;mathbb{R}&#92; &#92; |f(x)|&#92;leq M' title='&#92;exists M&#92;in&#92;mathbb{R}&#92;  &#92;forall x&#92;in&#92;mathbb{R}&#92; &#92; |f(x)|&#92;leq M' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> are bound variables and <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is free. (An intuitive test you can apply is this: of which variables does it make sense to say that the statement is telling us something about those variables? The above statement is telling us about <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> and it doesn&#8217;t make sense to say that it is telling us about <img src='http://s0.wp.com/latex.php?latex=M&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M' title='M' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />.)</li>
<li>Bound variables are a bit like the dummy variables that appear in expressions such as <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Enm%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^nm^2' title='&#92;sum_{m=1}^nm^2' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5C%7Bx%5Cin%5Cmathbb%7BR%7D%3Ax%5E2%5Cleq+a%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{x&#92;in&#92;mathbb{R}:x^2&#92;leq a&#92;}' title='&#92;{x&#92;in&#92;mathbb{R}:x^2&#92;leq a&#92;}' class='latex' />. (In the first expression <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> is a dummy variable and in the second one <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is a dummy variable.)</li>
<li>When you are writing a proof, always be clear in your mind, and in your write-up, which variables are free and which are bound.</li>
<li>Always introduce your variables before talking about them.</li>
<li><em>Give names</em> to the objects you want to talk about. You&#8217;ll find it much easier to express yourself.</li>
<li>An important way of introducing a variable is to use the &#8220;let&#8221; trick and the &#8220;suppose&#8221; trick. For instance, if you want to prove a statement of the form <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' title='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' class='latex' />, then without thinking you can begin your proof, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' /> and suppose that <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />.&#8221; Another way of putting it is, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be an element of <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />.&#8221;</li>
<li>If you have just proved a statement of the form <img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5C+%5C+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92; &#92; P(x)' title='&#92;exists x&#92; &#92; P(x)' class='latex' /> then it is considered OK to go on and talk about <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> as though you had just chosen it &#8212; even if you haven&#8217;t. For example, it&#8217;s OK to write, &#8220;There exists <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. It follows that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B%5Ccup+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B&#92;cup C' title='f(x)&#92;in B&#92;cup C' class='latex' />.&#8221; which is a kind of shorthand for, &#8220;There exists <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=f%28y%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(y)&#92;in B' title='f(y)&#92;in B' class='latex' />. Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be such that <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B' title='f(x)&#92;in B' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%5Cin+B%5Ccup+C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&#92;in B&#92;cup C' title='f(x)&#92;in B&#92;cup C' class='latex' />.&#8221;</li>
</ol>
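<p>If you would rather not take the propositional rules in this summary on trust, one way to convince yourself is to check every combination of truth values by brute force. What follows is only a minimal Python sketch of that idea, not part of the course: the helper names <code>implies</code>, <code>equivalent</code> and <code>is_prime</code>, and the particular finite set <code>A</code>, are illustrative assumptions. It checks the distributive laws, de Morgan&#8217;s laws, the negation of an implication, the contrapositive and the converse, and then the rule for negating a quantified statement on one small example.</p>

```python
from itertools import product
from math import isqrt

BOOLS = (False, True)


def implies(p: bool, q: bool) -> bool:
    """Truth-value implication: false only when p is true and q is false."""
    return not (p and not q)


def equivalent(f, g, arity: int) -> bool:
    """True if f and g agree on every assignment of truth values."""
    return all(f(*vals) == g(*vals) for vals in product(BOOLS, repeat=arity))


# AND distributes over OR, and OR distributes over AND.
dist_and_over_or = equivalent(lambda p, q, r: p and (q or r),
                              lambda p, q, r: (p and q) or (p and r), 3)
dist_or_over_and = equivalent(lambda p, q, r: p or (q and r),
                              lambda p, q, r: (p or q) and (p or r), 3)

# de Morgan's laws.
de_morgan_or = equivalent(lambda p, q: not (p or q),
                          lambda p, q: (not p) and (not q), 2)
de_morgan_and = equivalent(lambda p, q: not (p and q),
                           lambda p, q: (not p) or (not q), 2)

# The negation of "P implies Q" is "P and not Q".
neg_implies = equivalent(lambda p, q: not implies(p, q),
                         lambda p, q: p and not q, 2)

# The contrapositive is equivalent to the original; the converse is not.
contrapositive_ok = equivalent(implies, lambda p, q: implies(not q, not p), 2)
converse_ok = equivalent(implies, lambda p, q: implies(q, p), 2)


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


# A hypothetical finite set, chosen so that the statement fails (9 is odd, not prime).
A = {2, 3, 5, 9, 10}

# "For every x in A, x is odd implies x is prime."
statement = all(implies(x % 2 == 1, is_prime(x)) for x in A)
# "There exists x in A such that x is odd and x is not prime."
negation = any(x % 2 == 1 and not is_prime(x) for x in A)

print(dist_and_over_or, dist_or_over_and, de_morgan_or, de_morgan_and)
print(neg_implies, contrapositive_ok, converse_ok)
print(statement, negation)
```

<p>Every check comes out true except <code>converse_ok</code>, which is exactly the point about not confusing a statement with its converse. For the quantified example, <code>negation</code> is always the opposite of <code>statement</code>, whichever finite set you substitute for <code>A</code>. A check over a finite set is of course no substitute for a proof about an infinite one.</p>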
<p>I have written the posts so far because I want to be able to take this kind of thing for granted when I discuss some of the definitions and proofs that are covered in this term&#8217;s courses. I don&#8217;t mean that I&#8217;ll never mention any of these basic logical principles again &#8212; far from it. But when I do mention them, I&#8217;ll assume that you have some familiarity with them, so that it will be enough for me to offer reminders rather than explaining everything from scratch.</p>
<p>If you don&#8217;t feel entirely confident that you&#8217;re on top of basic logic, there is something you can do that you couldn&#8217;t do in my day, which is to take an online test. Devised by Terence Tao, this test consists of simple questions whose sole purpose is to let you see whether you understand the kind of logic you need to handle complicated mathematical statements. (The traditional way was to be thrown in at the deep end: you just had to handle the complicated statements and if your logic was shaky your supervisor would tell you.)</p>
<p>Rather than give you a link directly to the logic questions, I&#8217;m giving a link to <a href="http://scherk.pbworks.com/w/page/14864181/FrontPage">a page with lots of quizzes on it</a>. This really is a wonderful resource: I strongly recommend that before you do an examples sheet on a topic covered in one of the quizzes here, you first do the quiz to check that you are secure on the basics. And I strongly recommend doing the quiz entitled &#8220;Logic&#8221; before you do any (pure) examples sheets at all.</p>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/09/basic-logic-summary/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Basic logic &#8212; tips for handling variables</title>
		<link>http://gowers.wordpress.com/2011/10/07/basic-logic-tips-for-handling-variables/</link>
		<comments>http://gowers.wordpress.com/2011/10/07/basic-logic-tips-for-handling-variables/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 10:49:36 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Basic logic]]></category>
		<category><![CDATA[Cambridge teaching]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3366</guid>
		<description><![CDATA[Roughly speaking, a variable is any letter you use to stand for an unknown object of a certain type. For example, if you write x+y=20, then x and y are variables. If you write, &#8220;Let A be a subset of N,&#8221; then A is a variable (it is an unknown set of a certain kind) whereas N isn&#8217;t (it&#8217;s the name [...]]]></description>
			<content:encoded><![CDATA[<p>Roughly speaking, a <em>variable</em> is any letter you use to stand for an unknown object of a certain type. For example, if you write <img src='http://s0.wp.com/latex.php?latex=x%2By%3D20%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+y=20,' title='x+y=20,' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are variables. If you write, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> be a subset of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N},' title='&#92;mathbb{N},' class='latex' />&#8221; then <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is a variable (it is an unknown set of a certain kind) whereas <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' /> isn&#8217;t (it&#8217;s the name we give to the set of all positive integers). 
I suppose the definition I&#8217;ve just given isn&#8217;t quite perfect, since if I asked you to solve the simultaneous equations <img src='http://s0.wp.com/latex.php?latex=x%2By%3D8&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+y=8' title='x+y=8' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%2B3y%3D12%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+3y=12,' title='x+3y=12,' class='latex' /> then one would normally call <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> variables even though their values are completely determined by the equations. Though even then one could say that they started out as &#8220;unknown&#8221;. </p>
<p>Just in case I&#8217;ve gone and confused the issue, let me try to clear it up instantly. It would be quite normal to say something like this: &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be two real numbers. Suppose that they satisfy the equations <img src='http://s0.wp.com/latex.php?latex=x%2By%3D8&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+y=8' title='x+y=8' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%2B3y%3D12.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x+3y=12.' title='x+3y=12.' class='latex' /> Determine the values of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y.' title='y.' class='latex' />&#8221; It is then reasonable to call them variables, because when I started discussing them I gave no information about them whatever. I then went on to specify some relationships between <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y,' title='y,' class='latex' /> and it so happened that from those relationships it was possible to deduce the exact values of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y.' title='y.' class='latex' /><br />
<span id="more-3366"></span></p>
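<p>As a quick check of that deduction, here is a minimal Python sketch that solves the two equations exactly. The helper name <code>solve_2x2</code> and the use of Cramer&#8217;s rule are choices made for this illustration only.</p>

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations do not determine x and y uniquely")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# x + y = 8 and x + 3y = 12
x, y = solve_2x2(1, 1, 8, 1, 3, 12)
print(x, y)  # 6 2
```

<p>Before the call, nothing at all is known about the two unknowns; the relationships alone pin them down to 6 and 2.</p>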
<p><strong>Free and bound variables.</strong></p>
<p>There is a simple, but incredibly important, distinction between two kinds of variables. It&#8217;s one that you will have come across if you are familiar with the <img src='http://s0.wp.com/latex.php?latex=%5CSigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Sigma' title='&#92;Sigma' class='latex' /> notation for sums. Consider the expression <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Enm%5E2.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^nm^2.' title='&#92;sum_{m=1}^nm^2.' class='latex' /> That involves two variables, <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n.' title='n.' class='latex' /> But the role they play in the expression is very different indeed. To see the difference, ask yourself, for each variable, what difference it would make if you replaced it by a different variable. Let&#8217;s try it. I&#8217;ll start by replacing <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=r.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r.' title='r.' class='latex' /> I get <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Erm%5E2.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^rm^2.' title='&#92;sum_{m=1}^rm^2.' 
class='latex' /> So whereas before I was adding up the first <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> squares, now I&#8217;m adding up the first <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> squares, so if <img src='http://s0.wp.com/latex.php?latex=n%5Cne+r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;ne r' title='n&#92;ne r' class='latex' /> then I&#8217;ll be getting a different number. Now let&#8217;s instead replace <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=r.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r.' title='r.' class='latex' /> That gives us the expression <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Br%3D1%7D%5Enr%5E2.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{r=1}^nr^2.' title='&#92;sum_{r=1}^nr^2.' class='latex' /> How does that differ from <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Enm%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^nm^2' title='&#92;sum_{m=1}^nm^2' class='latex' />? It <em>doesn&#8217;t</em>. It&#8217;s just another way of writing the sum of the first <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> squares. We call <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> a <em>free</em> variable (roughly speaking because we are free to choose a value for it) and <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> a <em>bound</em> variable, or <em>dummy</em> variable.</p>
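<p>The same distinction can be seen in a small Python sketch. The function names below are made up for this example: the parameter plays the role of the free variable, and the variable inside the generator expression plays the role of the bound one.</p>

```python
def sum_of_squares(n):
    # n is free: the caller chooses its value.
    # m is bound: it exists only inside the sum.
    return sum(m ** 2 for m in range(1, n + 1))


def sum_of_squares_renamed(n):
    # The same sum with the bound variable renamed from m to r.
    return sum(r ** 2 for r in range(1, n + 1))


print(sum_of_squares(4))          # 30
print(sum_of_squares_renamed(4))  # 30: renaming the bound variable changes nothing
print(sum_of_squares(5))          # 55: changing the free variable changes the answer
```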
<p>Here are two further ways of distinguishing between free and bound variables. The first is to ask yourself the question, &#8220;What value does this variable take?&#8221; If the question is sensible, then the variable is free, and if it&#8217;s a stupid question then the variable is bound. For instance, if I ask what the value of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> is in the expression <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Enm%5E2%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^nm^2,' title='&#92;sum_{m=1}^nm^2,' class='latex' /> that is a stupid question: it&#8217;s just standing for something that goes from 1 to n. (The same phenomenon occurs in computer programs with FOR loops. If I write &#8220;FOR m=1 TO n DO such and such&#8221;, then <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> isn&#8217;t something you can substitute a value for, whereas <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is.) But if I ask what the value of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is, that&#8217;s not ridiculous at all: we might decide to set <img src='http://s0.wp.com/latex.php?latex=n%3D100&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=100' title='n=100' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=n%3D6k%2B2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=6k+2' title='n=6k+2' class='latex' /> for some other variable <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> that&#8217;s floating around, and so on.</p>
<p>The second way is to see whether you can rewrite the expression in a way that doesn&#8217;t mention the variable in question. For example, I can rewrite <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bm%3D1%7D%5Enm%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{m=1}^nm^2' title='&#92;sum_{m=1}^nm^2' class='latex' /> as <img src='http://s0.wp.com/latex.php?latex=1%5E2%2B2%5E2%2B%5Cdots%2Bn%5E2.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1^2+2^2+&#92;dots+n^2.' title='1^2+2^2+&#92;dots+n^2.' class='latex' /> This test doesn&#8217;t always work that well. For example, <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> is a dummy variable in the expression <img src='http://s0.wp.com/latex.php?latex=%5Cint_1%5Ex%5Csin+t+dt%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;int_1^x&#92;sin t dt,' title='&#92;int_1^x&#92;sin t dt,' class='latex' /> but it&#8217;s difficult to rewrite the expression without mentioning some variable that plays the role of <img src='http://s0.wp.com/latex.php?latex=t.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t.' title='t.' class='latex' /> In fact, it was difficult even in the summation case &#8212; I had to use dots and trust that you would know what I was talking about. Even so, the test may occasionally be helpful. </p>
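<p>A numerical sketch makes the same point for the integral. The midpoint rule and the parameter <code>steps</code> are assumptions of this illustration: inside each function the dummy variable is just a local name, and only the upper limit is something the caller can choose.</p>

```python
import math


def integral_of_sin(x, steps=10_000):
    """Midpoint-rule approximation to the integral of sin t dt from 1 to x."""
    width = (x - 1) / steps
    total = 0.0
    for i in range(steps):
        t = 1 + (i + 0.5) * width  # t is the dummy variable
        total += math.sin(t)
    return total * width


def integral_of_sin_renamed(x, steps=10_000):
    """The same approximation with the dummy variable called u instead of t."""
    width = (x - 1) / steps
    total = 0.0
    for i in range(steps):
        u = 1 + (i + 0.5) * width
        total += math.sin(u)
    return total * width


x = 2.0
exact = math.cos(1) - math.cos(x)  # an antiderivative of sin t is -cos t
print(integral_of_sin(x), integral_of_sin_renamed(x), exact)
```

<p>The two approximations agree with each other exactly, and both are close to the closed form, which mentions the upper limit but no dummy variable at all.</p>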
<p>So far, I&#8217;ve talked about free and bound variables in expressions that stand for mathematical objects (in both cases numbers). However, the main theme of this section of the post is really free and bound variables in <em>statements</em>. Let me give an example.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=y%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;in B' title='y&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=y%5Cleq+x%5Cleq+2y.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y&#92;leq x&#92;leq 2y.' title='y&#92;leq x&#92;leq 2y.' class='latex' /></li>
<p>Here I&#8217;m imagining that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are sets of real numbers. The statement is telling me that every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> can be sandwiched between some element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and twice that element. </p>
<p>The statement above involved four variables, <img src='http://s0.wp.com/latex.php?latex=A%2C+B%2C+x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A, B, x' title='A, B, x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y.' title='y.' class='latex' /> I hope it is already obvious to you which ones are free and which are bound. In case it isn&#8217;t, just apply the test of seeing whether it makes a difference to what the statement is saying if you change a variable to something else. Then you will see that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are free variables and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are bound variables. Why? <em>Because the statement is about <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> and not about <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y.' title='y.' 
class='latex' /></em> (I recommend reading it and making sure not only that you understand what it means but also that you agree that it is saying something about <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />.) If you change <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=D%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D,' title='D,' class='latex' /> then you are no longer saying that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are related in a certain way: you are saying that <img src='http://s0.wp.com/latex.php?latex=C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C' title='C' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D' title='D' class='latex' /> are related in that way. But if you say,</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=a%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;in A' title='a&#92;in A' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=b%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;in B' title='b&#92;in B' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=b%5Cleq+a%5Cleq+2b.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b&#92;leq a&#92;leq 2b.' title='b&#92;leq a&#92;leq 2b.' class='latex' /></li>
<p>then you are expressing exactly the same relationship between <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> as you were before.</p>
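<p>For finite sets, the statement can be written as a short Python function, which makes the roles of the variables visible. This is only a sketch: the function names and the sample sets are invented for the illustration. The two sets are the parameters, while the quantified variables live entirely inside the body.</p>

```python
def every_element_sandwiched(A, B):
    """For every x in A there exists y in B such that y <= x <= 2y."""
    return all(any(y <= x <= 2 * y for y in B) for x in A)


def same_statement_renamed(A, B):
    # Bound variables renamed from x, y to a, b: the meaning is unchanged.
    return all(any(b <= a <= 2 * b for b in B) for a in A)


A = {3, 5, 8}
B = {2, 4}
print(every_element_sandwiched(A, B))  # True: 2<=3<=4, 4<=5<=8, 4<=8<=8
print(same_statement_renamed(A, B))    # True
print(every_element_sandwiched(B, A))  # False: no element of A is at most 2
```

<p>Swapping the arguments changes the answer, because the statement is about the two sets. Renaming the variables inside the body does not.</p>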
<p>An easy way to tell which variables are free and which are bound in certain types of sentences is just to look and see which ones appear in quantifiers and which don&#8217;t. For example, take the following, by now fairly familiar, statement.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N}' title='N&#92;in&#92;mathbb{N}' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N,' title='n&#92;geq N,' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon.' title='|a_n-a|&lt;&#92;epsilon.' class='latex' /></li>
<p>Four of the variables that appear are <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%2C+N%2C+n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon, N, n' title='&#92;epsilon, N, n' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a.' title='a.' class='latex' /> The status of <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' /> is a little less easy to describe: I&#8217;ll come back to it in a moment. Now <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon' title='&#92;epsilon' class='latex' /> appears inside a quantifier, since the statement begins, &#8220;For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' />.&#8221; Similarly, <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> appear inside quantifiers. But <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> doesn&#8217;t. Therefore, <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%2C+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon, N' title='&#92;epsilon, N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> are bound, but <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is free.</p>
<p>What about <img src='http://s0.wp.com/latex.php?latex=a_n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n' title='a_n' class='latex' />? A good way to understand its role is to write it instead as <img src='http://s0.wp.com/latex.php?latex=f%28n%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(n).' title='f(n).' class='latex' /> That is, we treat the sequence <img src='http://s0.wp.com/latex.php?latex=a_1%2Ca_2%2Ca_3%2C%5Cdots&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_1,a_2,a_3,&#92;dots' title='a_1,a_2,a_3,&#92;dots' class='latex' /> as a function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> that takes positive integers and turns them into real numbers. With this new notation the sentence would be rewritten as follows.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> there exists <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N}' title='N&#92;in&#92;mathbb{N}' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N,' title='n&#92;geq N,' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=%7Cf%28n%29-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|f(n)-a|&lt;&#92;epsilon.' title='|f(n)-a|&lt;&#92;epsilon.' class='latex' /></li>
<p>Now we can simply say that <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> is a variable: it stands for an unknown function from the positive integers to the real numbers. Since we have not quantified over <img src='http://s0.wp.com/latex.php?latex=f%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f,' title='f,' class='latex' /> it is a free variable. </p>
<p>If that analysis is correct, then it should be the case that the sentence above is telling us about <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=a%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a,' title='a,' class='latex' /> while <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%2C+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon, N' title='&#92;epsilon, N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> are just placeholders. And indeed that is the case. When we say that a sequence converges to a limit, we are talking about the sequence and the limit, and not all the other variables that come in when we write out the definition in full. </p>
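<p>A computer cannot check infinitely many values, so the following Python sketch is only a finite stand-in for the definition. The cut-off <code>horizon</code>, the function name <code>tail_within</code> and the sample sequence are all assumptions made for the illustration. What it does show faithfully is which things are supplied from outside (the sequence and the proposed limit) and which are merely run through.</p>

```python
from fractions import Fraction


def tail_within(f, a, eps, N, horizon):
    """Check |f(n) - a| < eps for every n with N <= n <= horizon.

    This is a finite stand-in for "for every n >= N"; horizon is an
    arbitrary cut-off chosen for this illustration.
    """
    return all(abs(f(n) - a) < eps for n in range(N, horizon + 1))


def f(n):
    # Sample sequence: 1, 1/2, 1/3, ...
    return Fraction(1, n)


a = 0  # proposed limit
for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
    N = int(1 / eps) + 1  # one choice of N that works for this f and a
    print(eps, N, tail_within(f, a, eps, N, horizon=1000))
```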
<p>Note that whether or not a variable is free depends very much on the statement that you regard it as being part of. For instance, suppose I write the following: &#8220;Let <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N.' title='n&#92;geq N.' class='latex' /> Then by the above calculation we see that <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon.' title='|a_n-a|&lt;&#92;epsilon.' class='latex' />&#8221; If I regard <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> as part of the second sentence only, then it is free. But if I regard it as part of both sentences, and if I regard those as a way of saying, &#8220;For every <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N,' title='n&#92;geq N,' class='latex' /> [the above calculation shows that] <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon,' title='|a_n-a|&lt;&#92;epsilon,' class='latex' />&#8221; then we are quantifying over <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> so it is bound. In a funny way, the word &#8220;Let&#8221; could be said to &#8220;liberate&#8221; <img src='http://s0.wp.com/latex.php?latex=n.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n.' title='n.' class='latex' /> (The phrase &#8220;with one bound he was free&#8221; comes to mind, but that really does confuse things.)</p>
<p>I said in the title that I was going to offer some tips for handling variables. So here&#8217;s one.</p>
<li><em>Always be completely sure in your mind which variables are free and which are bound.</em></li>
<p>But actually, the real message of this post is more basic.</p>
<li><em>Always introduce your variables to the reader before going on to talk about them.</em></li>
<p>In this small respect, you should treat your variables like people. Suppose that you had two friends, Anne and David, who had never met. You wouldn&#8217;t begin by saying to David, &#8220;Anne went to Amsterdam a few months ago.&#8221; Only once you&#8217;d said, &#8220;This is Anne,&#8221; or words to that effect, would imparting information about Anne be appropriate social behaviour. I have often read supervision work, not really understood what has been written, and felt moved to ask something like, &#8220;What is <img src='http://s0.wp.com/latex.php?latex=%5Cdelta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;delta' title='&#92;delta' class='latex' />?&#8221; in roughly the tone of voice that David might ask, &#8220;Who&#8217;s Anne?&#8221; if you launched in with information about her last holiday.</p>
<p>How do you introduce a variable? Let me illustrate by example. First, a definition: a function that takes real numbers to real numbers is called <em>strictly increasing</em> if &#8230;</p>
<p>I&#8217;ve got to that point in my sentence and realized that it is rather difficult to say what I mean unless I give the function a name. Here&#8217;s what might have happened if I had struggled on with the sentence I was in the middle of writing.</p>
<li>A function that takes real numbers to real numbers is called <em>strictly increasing</em> if whenever you apply it to two real numbers, one of which is greater than the other, then the value it takes at the greater number is greater than the value it takes at the smaller number.</li>
<p>Here&#8217;s a <em>much</em> clearer way of saying the same thing.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> is called <em>strictly increasing</em> if for every pair of real numbers <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Cf%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&lt;f(y)' title='f(x)&lt;f(y)' class='latex' />.</li>
<p>I could have written that in a slightly less formal way as follows.</p>
<li>A function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> from <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{R}' title='&#92;mathbb{R}' class='latex' /> is called <em>strictly increasing</em> if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Cf%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&lt;f(y)' title='f(x)&lt;f(y)' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />.</li>
<p>That is less formal because I didn&#8217;t specify what <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> were (leaving it to the context to make it clear that they were real numbers) and I used the word &#8220;whenever&#8221; (leaving it to the reader to work out how to convert that into a statement involving a universal quantifier). However, I am much more concerned with the difference between these last two formulations of the definition and the first one. And that difference is that I followed another tip for dealing with variables:</p>
<li><em>Give names to the things you are talking about.</em></li>
<p>If you do that, then you nearly always convert clumsy, wordy sentences into much cleaner ones.</p>
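<p>Once the function and the two numbers have names, the definition also translates almost word for word into code. The Python sketch below tests it on a finite list of sample points; the name <code>strictly_increasing_on</code> is hypothetical, and a finite test like this can refute the property but cannot prove it for all real numbers.</p>

```python
from itertools import combinations


def strictly_increasing_on(f, points):
    """Check f(x) < f(y) for every pair x < y drawn from points."""
    ordered = sorted(set(points))
    # combinations of a sorted list yields each pair (x, y) with x < y once.
    return all(f(x) < f(y) for x, y in combinations(ordered, 2))


samples = [-2, -1, 0, 1, 3]
print(strictly_increasing_on(lambda x: x ** 3, samples))  # True
print(strictly_increasing_on(lambda x: x ** 2, samples))  # False: (-2)^2 > (-1)^2
```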
<p>Incidentally, this piece of advice very much depends on our modern practice of using letters to stand for whatever we feel like making them stand for. Before this practice was invented, nobody knew of any way of expressing mathematical statements apart from what I have been calling the clumsy, wordy way. For example, this is how the sixth proposition from Book II of <a href="http://en.wikipedia.org/wiki/Euclid's_Elements">Euclid&#8217;s <em>Elements</em></a> was stated (in translation of course, but the point still stands).</p>
<blockquote><p>If a straight line be bisected and a straight line be added to it in a straight line, the rectangle contained by the whole with the added straight line and the added straight line together with the square on the half is equal to the square on the straight line made up of the half and the added straight line.</p></blockquote>
<p>If you have the faintest idea what that&#8217;s saying, then you&#8217;re doing better than me. Personally, I find a sentence like that more or less impossible to understand. And if I do want to understand it, I have to translate it. Fortunately, we know how to avoid that style now, so please avoid it.</p>
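<p>For what it is worth, here is one possible translation. Write <code>a</code> for half the original line and <code>b</code> for the added line, so that the whole with the added line has length 2a + b. On that reading, which is an interpretation rather than a quotation, the proposition says that (2a + b)b + a&#178; = (a + b)&#178;. The Python sketch below checks that identity on a range of values.</p>

```python
def euclid_ii_6_holds(a, b):
    """Check (2a + b) * b + a^2 == (a + b)^2 for given lengths a and b."""
    whole_with_added = 2 * a + b          # the bisected line plus the added line
    rectangle = whole_with_added * b      # rectangle on that and the added line
    square_on_half = a * a
    square_on_half_plus_added = (a + b) ** 2
    return rectangle + square_on_half == square_on_half_plus_added


print(all(euclid_ii_6_holds(a, b) for a in range(1, 20) for b in range(1, 20)))  # True
```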
<p>Back to what I was really talking about, which was the principle that you should introduce your variables before you talk about them. As I&#8217;ve just been saying, a function <img src='http://s0.wp.com/latex.php?latex=f&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f' title='f' class='latex' /> from the reals to the reals is <em>strictly increasing</em> if <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Cf%28y%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)&lt;f(y)' title='f(x)&lt;f(y)' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />. Imagine now that you had a question on an examples sheet that asked you to prove that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^3' title='f(x)=x^3' class='latex' /> is strictly increasing. (By the way, a quick aside. You may notice that I wrote &#8220;examples sheet&#8221; and that almost all the sheets your lecturers hand out have at the top the words &#8220;example sheet&#8221;. In my day they had at the top the words &#8220;examples sheet&#8221;, for the simple reason that each sheet was a sheet of examples. I get irritated with the phrase &#8220;example sheet&#8221;, but as I write this I realize that if you replace the word &#8220;example&#8221; with &#8220;problem&#8221; or &#8220;question&#8221; then the plural seems utterly weird. In fact, I can&#8217;t think of a single example where &#8220;An X of Ys&#8221; would become &#8220;A Ys X&#8221; rather than &#8220;A Y X&#8221;. Somehow that&#8217;s even more irritating. And somehow &#8220;examples sheet&#8221; <em>still</em> feels right to me.) </p>
<p>OK, how do we show that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^3' title='f(x)=x^3' class='latex' /> is strictly increasing? Here&#8217;s how <em>not</em> to begin your argument.</p>
<li>Since <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' /> we know that <img src='http://s0.wp.com/latex.php?latex=y-x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y-x' title='y-x' class='latex' /> is positive.</li>
<p>If you write that and I ever see what you&#8217;ve written, it will be a pure reflex for me to ask, &#8220;WHAT ARE <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> AND <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />?&#8221; If you think that that&#8217;s ridiculously pedantic and that the context makes it obvious that you have chosen two real numbers <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />, I would respond as follows.</p>
<p>(i) It is true that I can tell what context you had set up in your head before you wrote that sentence.</p>
<p>(ii) It is also true that this is the only context I can think of that makes sense of what you wrote.</p>
<p>(iii) Nevertheless, you have not explained the context.</p>
<p>(iv) If you get into the habit of not explaining the context, then you will run into difficulties when the proofs get a little bit more complicated.</p>
<p>Point (iv) is the most important. As soon as you need to write proofs involving definitions with strings of two or three quantifiers, you will get into a mess if you don&#8217;t say what your variables are doing.</p>
<p>Back to this proof. What should one write instead? Something more like this.</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers with <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />. [That's the "This is Anne" part.] Then <img src='http://s0.wp.com/latex.php?latex=y-x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y-x' title='y-x' class='latex' /> is positive. [By the way, Anne has been to Amsterdam recently.]</li>
<p>It may seem so obvious to you that the function <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3Dx%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)=x^3' title='f(x)=x^3' class='latex' /> is strictly increasing that you&#8217;re not quite sure how to prove it. What I want to do is deduce this simple fact from very basic principles about how inequalities interact with addition and multiplication. The main two are these, which hold for any three real numbers <img src='http://s0.wp.com/latex.php?latex=a%2C+b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a, b' title='a, b' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=c&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c' title='c' class='latex' />.</p>
<li>If <img src='http://s0.wp.com/latex.php?latex=a%3Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&lt;b' title='a&lt;b' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=a%2Bc%3Cb%2Bc&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a+c&lt;b+c' title='a+c&lt;b+c' class='latex' />. [You can add anything you like to both sides of an inequality.]</li>
<li>If <img src='http://s0.wp.com/latex.php?latex=a%3Cb&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&lt;b' title='a&lt;b' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=c%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&gt;0' title='c&gt;0' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=ac%3Cbc&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ac&lt;bc' title='ac&lt;bc' class='latex' />. [You can multiply both sides of an inequality by a positive number.]</li>
<p>Let me indicate two arguments. The first is a bit crude but it gets the job done. </p>
<p>If <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are both positive, then </p>
<p><img src='http://s0.wp.com/latex.php?latex=x%5E3%3Dxx%5E2%3Cyx%5E2%3Dx%28yx%29%3Cy%28yx%29%3Dxy%5E2%3Cyy%5E2%3Dy%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^3=xx^2&lt;yx^2=x(yx)&lt;y(yx)=xy^2&lt;yy^2=y^3' title='x^3=xx^2&lt;yx^2=x(yx)&lt;y(yx)=xy^2&lt;yy^2=y^3' class='latex' />.</p>
<p>There I made repeated use of the second principle above. I also used the fact that <img src='http://s0.wp.com/latex.php?latex=x%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^2' title='x^2' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=yx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='yx' title='yx' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y^2' title='y^2' class='latex' /> are all positive, which can be deduced from the second principle too.</p>
<p>If <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are negative, we can use the fact that <img src='http://s0.wp.com/latex.php?latex=-x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-x' title='-x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=-y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='-y' title='-y' class='latex' /> are positive and the fact that <img src='http://s0.wp.com/latex.php?latex=x%5E3%3D-%28-x%29%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^3=-(-x)^3' title='x^3=-(-x)^3' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y%5E3%3D-%28-y%29%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y^3=-(-y)^3' title='y^3=-(-y)^3' class='latex' /> to deduce that <img src='http://s0.wp.com/latex.php?latex=x%5E3%3Cy%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x^3&lt;y^3' title='x^3&lt;y^3' class='latex' /> from what we have just proved. And if one of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> is zero, then we can use the fact that the other is positive (if <img src='http://s0.wp.com/latex.php?latex=x%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=0' title='x=0' class='latex' />) or negative (if <img src='http://s0.wp.com/latex.php?latex=y%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y=0' title='y=0' class='latex' />). If one is positive and the other negative, we can use the zero case to prove what we want in two steps.</p>
<p>I won&#8217;t give the full details of that argument, because there is a cleaner argument that does it all in one go. Note first that <img src='http://s0.wp.com/latex.php?latex=y%5E3-x%5E3%3D%28y-x%29%28y%5E2%2Bxy%2Bx%5E2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y^3-x^3=(y-x)(y^2+xy+x^2)' title='y^3-x^3=(y-x)(y^2+xy+x^2)' class='latex' />. Now <img src='http://s0.wp.com/latex.php?latex=y-x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y-x' title='y-x' class='latex' /> is positive, by assumption. As for the second bracket, it equals <img src='http://s0.wp.com/latex.php?latex=%28y%5E2%2Bx%5E2%2B%28y%2Bx%29%5E2%29%2F2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(y^2+x^2+(y+x)^2)/2' title='(y^2+x^2+(y+x)^2)/2' class='latex' />, which is positive because the square of any number is non-negative and <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are not both equal to 0. (Why is the square of any number non-negative? I leave that to you as an exercise.)</p>
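<p>In case it helps to see the two identities in that argument checked, here is the calculation written out. It is only a sketch of the routine algebra, and it uses nothing beyond multiplying out brackets. For the factorization,</p>
<p><img src='http://s0.wp.com/latex.php?latex=%28y-x%29%28y%5E2%2Bxy%2Bx%5E2%29%3Dy%5E3%2Bxy%5E2%2Bx%5E2y-xy%5E2-x%5E2y-x%5E3%3Dy%5E3-x%5E3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(y-x)(y^2+xy+x^2)=y^3+xy^2+x^2y-xy^2-x^2y-x^3=y^3-x^3' title='(y-x)(y^2+xy+x^2)=y^3+xy^2+x^2y-xy^2-x^2y-x^3=y^3-x^3' class='latex' />,</p>
<p>and for the second bracket,</p>
<p><img src='http://s0.wp.com/latex.php?latex=y%5E2%2Bx%5E2%2B%28y%2Bx%29%5E2%3Dy%5E2%2Bx%5E2%2By%5E2%2B2xy%2Bx%5E2%3D2%28y%5E2%2Bxy%2Bx%5E2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y^2+x^2+(y+x)^2=y^2+x^2+y^2+2xy+x^2=2(y^2+xy+x^2)' title='y^2+x^2+(y+x)^2=y^2+x^2+y^2+2xy+x^2=2(y^2+xy+x^2)' class='latex' />.</p>
<p>Dividing the second identity by 2 gives the expression for the second bracket as half a sum of three squares.</p>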
<p>Perhaps by this point you are getting cross with me because you think I should have just differentiated. There are all sorts of answers to that, but the main one is that I simply don&#8217;t like using calculus when there&#8217;s an elementary calculus-free argument around. Another is that the argument via differentiation is slightly more complicated than you might think. In any case, the point of this whole discussion was not the proof itself but the fact that I began it with, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers with <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />.&#8221; The word &#8220;let&#8221; is incredibly useful for the purpose of introducing variables. It is typically used in two situations.</p>
<p><em>Situation 1.</em> You want to prove a statement about every element <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> of some set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. You begin your argument with, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />.&#8221; </p>
<p>That&#8217;s the situation we&#8217;ve just met. There I wanted to prove something about every pair of real numbers with the first less than the second. So I started the proof, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> be real numbers with <img src='http://s0.wp.com/latex.php?latex=x%3Cy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&lt;y' title='x&lt;y' class='latex' />.&#8221;</p>
<p><em>Situation 2.</em> You have just established that some object <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> exists with a property <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />. You follow that up with, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be a [insert name for type of object that <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> is] such that <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />.&#8221;</p>
<p>Let me give an example of situation 2. Recall the result I mentioned in an earlier post, that if <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> are two positive integers that fail to generate the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> then they must have a common factor greater than 1. Let&#8217;s suppose that I&#8217;m in the middle of a proof and I find that the only way that my argument can fail is if there is some integer <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> that cannot be written in the form <img src='http://s0.wp.com/latex.php?latex=am%2Bbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am+bn' title='am+bn' class='latex' />. My proof might continue as follows.</p>
<li>Since <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> cannot be written in the form <img src='http://s0.wp.com/latex.php?latex=am%2Bbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am+bn' title='am+bn' class='latex' />, it follows that <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> have a common factor greater than 1. <em>Let <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> be such a factor.</em> [Proceed to talk about <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />.]</li>
<p>Sometimes &#8212; in fact, extremely often &#8212; in this situation, mathematicians do something a bit sneaky. Instead of writing the careful introduction of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> that I&#8217;ve given above, they write something more like this.</p>
<li>Since <img src='http://s0.wp.com/latex.php?latex=r&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r' title='r' class='latex' /> cannot be written in the form <img src='http://s0.wp.com/latex.php?latex=am%2Bbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am+bn' title='am+bn' class='latex' />, it follows that <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> have a common factor <img src='http://s0.wp.com/latex.php?latex=d%3E1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&gt;1' title='d&gt;1' class='latex' />. [Proceed to talk about <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />.]</li>
<p>Strictly speaking, this second way of writing is badly incorrect because in the first sentence <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> is a bound variable (because the sentence is effectively saying, &#8220;There exists <img src='http://s0.wp.com/latex.php?latex=d%3E1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&gt;1' title='d&gt;1' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> is a factor of both <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.&#8221;) but when one goes on to talk about <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> it has magically become a free variable. But when a linguistic practice becomes sufficiently widespread, it makes no sense to call it incorrect. It is better to regard the above as a convenient shorthand: if we pass from existentially quantifying over a variable <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> in one sentence to talking about <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> in the next sentence, it should be understood that what we really mean is that we existentially quantify over a <em>different</em> variable, then say, &#8220;Let <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> be [an example of what we&#8217;ve just shown to exist]&#8221;, and then proceed to talk about <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />. 
As long as you know exactly what you are doing, then this is OK.</p>
<p>One further remark. If you establish the existence of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' /> and then go off and discuss something else for a while, then it will be quite confusing if later on in the argument you treat <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> as a free variable. So the shorthand above is probably best kept for situations where you prove that something exists and then immediately go on to talk about it (where by &#8220;it&#8221; I really mean one of the many possible examples).</p>
<p>Let me end with an exercise. Out of all the variables in the following sentence, which are free and which are bound? (The variables are <img src='http://s0.wp.com/latex.php?latex=A%2Cx%2C%5Cepsilon%2Cf&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A,x,&#92;epsilon,f' title='A,x,&#92;epsilon,f' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />.)</p>
<li><img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is a subset of the set of all <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> such that there exists <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%7Cf%28y%29-f%28x%29%7C%3C1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|f(y)-f(x)|&lt;1' title='|f(y)-f(x)|&lt;1' class='latex' /> whenever <img src='http://s0.wp.com/latex.php?latex=%7Cy-x%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|y-x|&lt;&#92;epsilon' title='|y-x|&lt;&#92;epsilon' class='latex' />.</li>
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/07/basic-logic-tips-for-handling-variables/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Basic logic &#8212; relationships between statements &#8212; converses and contrapositives</title>
		<link>http://gowers.wordpress.com/2011/10/05/basic-logic-relationships-between-statements-converses-and-contrapositives/</link>
		<comments>http://gowers.wordpress.com/2011/10/05/basic-logic-relationships-between-statements-converses-and-contrapositives/#comments</comments>
		<pubDate>Wed, 05 Oct 2011 10:28:13 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Basic logic]]></category>
		<category><![CDATA[Cambridge teaching]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3346</guid>
		<description><![CDATA[Converses. What is the relationship between the following two statements? 1. If n is 1 or a prime number, then (n-1)!+1 is divisible by n. 2. If (n-1)!+1 is divisible by n then n is 1 or a prime number. At first sight, this doesn&#8217;t look a very difficult question: the first statement is of the form P&#8658;Q and the second [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Converses.</strong></p>
<p>What is the relationship between the following two statements?</p>
<p>1. If <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is 1 or a prime number, then <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</p>
<p>2. If <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=n%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n,' title='n,' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is 1 or a prime number.</p>
<p>At first sight, this doesn&#8217;t look a very difficult question: the first statement is of the form <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> and the second is of the form <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P.' title='Q&#92;implies P.' class='latex' /> We say that the second statement is the <em>converse</em> of the first. (Note that the first statement is also the converse of the second.)<br />
<span id="more-3346"></span></p>
<p>However, if you have read the previous posts in this series, I hope you will be slightly anxious about what I have just written, because I have not been clear about whether the above two statements are referring to some specific <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> or whether they are meant as general statements about all positive integers <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. The latter is, I would suggest, the natural interpretation. And if you look at the kinds of converses that come up in a maths course, you will find that they are almost always not of specific statements, but rather of general statements.</p>
<p>This means that just as there are two notions of implication, there are also two notions of converse. One is the one I mentioned just a moment ago: the converse of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />. The other involves a universal quantifier: we take a statement of the form <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' title='&#92;forall x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' class='latex' /> and its converse is <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+Q%28x%29%5Cimplies+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; Q(x)&#92;implies P(x)' title='&#92;forall x&#92;in X&#92; &#92; Q(x)&#92;implies P(x)' class='latex' />. (We don&#8217;t have to have just one variable here. For instance, the converse of <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+%5Cforall+y%5Cin+Y%5C+%5C+P%28x%2Cy%29%5Cimplies+Q%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; &#92;forall y&#92;in Y&#92; &#92; P(x,y)&#92;implies Q(x,y)' title='&#92;forall x&#92;in X&#92; &#92; &#92;forall y&#92;in Y&#92; &#92; P(x,y)&#92;implies Q(x,y)' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+%5Cforall+y%5Cin+Y%5C+%5C+Q%28x%2Cy%29%5Cimplies+P%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; &#92;forall y&#92;in Y&#92; &#92; Q(x,y)&#92;implies P(x,y)' title='&#92;forall x&#92;in X&#92; &#92; &#92;forall y&#92;in Y&#92; &#92; Q(x,y)&#92;implies P(x,y)' class='latex' />.)</p>
<p>This is a potential source of confusion, since the definition of converse is often given as the first one (that the converse of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />), while almost all the interesting converses that actually come up in mathematics are of the second kind (where we universally quantify over one or more variables). </p>
<p>With that point behind us, let me discuss two examples of converses. </p>
<p><strong>Example 1.</strong></p>
<p>I&#8217;ll start with a simple one. Suppose you want to prove that two sets <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are equal. The following is the criterion for this to be the case: every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B,' title='B,' class='latex' /> and every element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A.' title='A.' class='latex' /> For convenience, let&#8217;s suppose that we know that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> are both subsets of a set <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' />. (For example, we might know that <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> consist of positive integers. In that case, <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> would be <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{N}' title='&#92;mathbb{N}' class='latex' />.) 
Then the statement that <img src='http://s0.wp.com/latex.php?latex=A%3DB&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A=B' title='A=B' class='latex' /> would break down into two statements that we could write using quantifiers in the following way.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+x%5Cin+A%5Cimplies+x%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; x&#92;in A&#92;implies x&#92;in B' title='&#92;forall x&#92;in X&#92; &#92; x&#92;in A&#92;implies x&#92;in B' class='latex' /></li>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+x%5Cin+B%5Cimplies+x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; x&#92;in B&#92;implies x&#92;in A' title='&#92;forall x&#92;in X&#92; &#92; x&#92;in B&#92;implies x&#92;in A' class='latex' /></li>
<p>The first of these tells us that every element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' />, and the second that every element of <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is an element of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' />. And those two statements are converses of each other.</p>
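<p>Here is a small illustration, with sets that I have chosen purely for the purpose. Let <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> be the set of positive integers that are multiples of 6 and let <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> be the set of positive integers that are multiples of both 2 and 3. If <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=x%3D6k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=6k' title='x=6k' class='latex' /> for some integer <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' />, so</p>
<p><img src='http://s0.wp.com/latex.php?latex=x%3D2%283k%29%3D3%282k%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=2(3k)=3(2k)' title='x=2(3k)=3(2k)' class='latex' />,</p>
<p>and therefore <img src='http://s0.wp.com/latex.php?latex=x%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in B' title='x&#92;in B' class='latex' />. In the other direction, if <img src='http://s0.wp.com/latex.php?latex=x%5Cin+B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in B' title='x&#92;in B' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=x%3D2j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=2j' title='x=2j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=x%3D3i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=3i' title='x=3i' class='latex' /> for some integers <img src='http://s0.wp.com/latex.php?latex=i&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='i' title='i' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' />, so</p>
<p><img src='http://s0.wp.com/latex.php?latex=x%3D3x-2x%3D6j-6i%3D6%28j-i%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=3x-2x=6j-6i=6(j-i)' title='x=3x-2x=6j-6i=6(j-i)' class='latex' />,</p>
<p>and therefore <img src='http://s0.wp.com/latex.php?latex=x%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in A' title='x&#92;in A' class='latex' />. Notice that the two directions needed different arguments, even in an example as simple as this one.</p>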
<p><strong>Example 2.</strong></p>
<p>Let <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> be two integers (not necessarily positive). Let us define <em>the set of integers generated by </em> <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> <em>and</em> <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> to be the set of all integers of the form <img src='http://s0.wp.com/latex.php?latex=am%2Bbn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am+bn' title='am+bn' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> are themselves integers. We could call this the set of all &#8220;integer combinations&#8221; of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. An important question that comes up in the number theory that you will do this term is this: under what circumstances is the set generated by <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> equal to the set of all integers? 
That is, when can you make any integer you want out of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> by adding appropriate multiples? </p>
<p>To get some traction on this problem, it might be worth looking at some examples. Suppose that <img src='http://s0.wp.com/latex.php?latex=m%3D40&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=40' title='m=40' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n%3D70&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=70' title='n=70' class='latex' />. Can we make all integers by adding multiples of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />? Quite clearly not, since every integer we can make will be a multiple of 10. Obviously this example can be generalized: if <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> have a common factor <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> that&#8217;s bigger than 1, then they cannot generate the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> (since everything that they generate will be a multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> and not all integers are multiples of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />). </p>
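<p>To spell out the calculation behind that remark: for any integers <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> we have</p>
<p><img src='http://s0.wp.com/latex.php?latex=40a%2B70b%3D10%284a%2B7b%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='40a+70b=10(4a+7b)' title='40a+70b=10(4a+7b)' class='latex' />.</p>
<p>More generally, suppose that <img src='http://s0.wp.com/latex.php?latex=m%3Ddk&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m=dk' title='m=dk' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n%3Ddl&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=dl' title='n=dl' class='latex' /> for some integers <img src='http://s0.wp.com/latex.php?latex=k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k' title='k' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=l&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='l' title='l' class='latex' /> (these two letters are just names introduced for this illustration). Then</p>
<p><img src='http://s0.wp.com/latex.php?latex=am%2Bbn%3Dadk%2Bbdl%3Dd%28ak%2Bbl%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='am+bn=adk+bdl=d(ak+bl)' title='am+bn=adk+bdl=d(ak+bl)' class='latex' />,</p>
<p>so every integer combination of <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />. In particular, if <img src='http://s0.wp.com/latex.php?latex=d%3E1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d&gt;1' title='d&gt;1' class='latex' /> then 1 cannot be made, since 1 is not a multiple of <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' />.</p>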
<p>Let us express this observation more formally. </p>
<li>For every pair of integers <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> have a common factor greater than 1, then <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> do not generate all of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' />.</li>
<p>Note that this statement is of the general form</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+m%2Cn%5Cin%5Cmathbb%7BZ%7D%5C+%5C+P%28m%2Cn%29%5Cimplies+Q%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall m,n&#92;in&#92;mathbb{Z}&#92; &#92; P(m,n)&#92;implies Q(m,n)' title='&#92;forall m,n&#92;in&#92;mathbb{Z}&#92; &#92; P(m,n)&#92;implies Q(m,n)' class='latex' /></li>
<p>As such, it has a converse, obtained by reversing the roles of the statements <img src='http://s0.wp.com/latex.php?latex=P%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(m,n)' title='P(m,n)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q%28m%2Cn%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(m,n)' title='Q(m,n)' class='latex' />. In words, the converse is as follows.</p>
<li>For every pair of integers <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> do not generate the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> have a common factor that is bigger than 1.</li>
<p>Now converses of true statements don&#8217;t have to be true. For example, the statement, &#8220;Every multiple of 10 is even,&#8221; is true, and its converse, &#8220;Every even number is a multiple of 10,&#8221; is false. However, the converse I have just written down happens to be true: it really is the case that if two integers don&#8217;t generate the whole of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' /> then they must have a common factor bigger than 1. If you are not familiar with this fact, then I recommend pausing for a few seconds just to think how you might prove it.<br />
.<br />
.<br />
.<br />
.<br />
And unless you are very unusually quick, you will have come to the conclusion that it is not at all obvious how to prove it. This may seem slightly mysterious: the two properties, &#8220;do not generate all of <img src='http://s0.wp.com/latex.php?latex=%5Cmathbb%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;mathbb{Z}' title='&#92;mathbb{Z}' class='latex' />&#8221; and &#8220;have a common factor bigger than 1&#8221; are equivalent, so in a sense the same property, and yet the second feels stronger than the first, in the sense that the first is easy to deduce from the second, while the second is not so easy to deduce from the first. I discussed this phenomenon <a href="http://gowers.wordpress.com/2008/12/28/how-can-one-equivalent-statement-be-stronger-than-another/">in a blog post a few years ago</a>, which provoked a number of very interesting comments. </p>
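<p>If you would like to experiment before trying to prove anything, here is a minimal Python sketch that checks the equivalence by brute force on a small range. It is evidence rather than proof, and the names <code>generates_all_of_Z</code>, <code>has_common_factor</code> and <code>LIMIT</code> are my own illustrative choices. It relies on the fact that two integers generate all of the integers exactly when some integer combination of them equals 1, and it assumes that a search over coefficients between <code>-LIMIT</code> and <code>LIMIT</code> is wide enough for inputs in the same range.</p>

```python
from math import gcd


def generates_all_of_Z(m, n, bound):
    """Brute-force search for integers a, b with a*m + b*n == 1.

    Such a and b exist exactly when m and n generate all of Z.
    Only coefficients between -bound and bound are tried.
    """
    coefficients = range(-bound, bound + 1)
    return any(a * m + b * n == 1 for a in coefficients for b in coefficients)


def has_common_factor(m, n):
    """True if some integer greater than 1 divides both m and n."""
    return gcd(m, n) != 1


LIMIT = 12
pairs = [(m, n) for m in range(-LIMIT, LIMIT + 1) for n in range(-LIMIT, LIMIT + 1)]

# The statement and its converse together say these two tests always agree.
equivalence_holds = all(
    has_common_factor(m, n) == (not generates_all_of_Z(m, n, LIMIT))
    for m, n in pairs
)
print(equivalence_holds)
```

<p>Notice that the search finds coefficients without explaining why they must exist, which is exactly the part of the converse that takes some thought.</p>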
<p><strong>Contrapositives.</strong> </p>
<p>Let <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> be two statements. Then the statement <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> is true (here I&#8217;m using the truth-value definition) if and only if it is not the case that <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is true and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> is false. </p>
<p>When is <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%5Cimplies%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q&#92;implies&#92;neg P' title='&#92;neg Q&#92;implies&#92;neg P' class='latex' /> true? Applying the same rule, the only thing that could stop it being true is if <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q' title='&#92;neg Q' class='latex' /> is true and <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P' title='&#92;neg P' class='latex' /> is false. But that&#8217;s the same as saying that <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is true and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> is false. In other words, the conditions for <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> to be true are precisely the same as the conditions for <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%5Cimplies%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q&#92;implies&#92;neg P' title='&#92;neg Q&#92;implies&#92;neg P' class='latex' /> to be true. The statement <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%5Cimplies%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q&#92;implies&#92;neg P' title='&#92;neg Q&#92;implies&#92;neg P' class='latex' /> is called the <em>contrapositive</em> of the statement <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' />. A statement and its contrapositive are always equivalent, but the importance of the contrapositive is that it is sometimes easier to prove than the original statement.</p>
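<p>The truth-value argument can also be checked mechanically. The following Python sketch simply runs through the four possible combinations of truth values; the helper <code>implies</code> is my own name for the truth-value definition of implication.</p>

```python
from itertools import product


def implies(p, q):
    """Truth-value definition: false only when p is true and q is false."""
    return not (p and not q)


# All four combinations of truth values for P and Q.
rows = list(product([False, True], repeat=2))

# A statement and its contrapositive agree on every row.
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)
print(contrapositive_agrees)
```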
<p>As ever, the situation is more interesting if we quantify over a parameter involved in a statement. Consider a statement of the form</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' title='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' class='latex' />.</li>
<p>For each <img src='http://s0.wp.com/latex.php?latex=x%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,' title='x,' class='latex' /> the inner part of this statement is equivalent to <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%28x%29%5Cimplies%5Cneg+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q(x)&#92;implies&#92;neg P(x)' title='&#92;neg Q(x)&#92;implies&#92;neg P(x)' class='latex' />. Therefore, the entire statement is equivalent to</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+%5Cneg+Q%28x%29%5Cimplies%5Cneg+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; &#92;neg Q(x)&#92;implies&#92;neg P(x)' title='x&#92;in X&#92; &#92; &#92;neg Q(x)&#92;implies&#92;neg P(x)' class='latex' />.</li>
<p>That was a bit abstract, so let me look at a simple example. A surprisingly useful way of proving that two real numbers <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> are equal is to show that the absolute value of their difference is less than every positive number. That is, we rely on the following principle.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%2Cy%5Cin%5Cmathbb%7BR%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x,y&#92;in&#92;mathbb{R}' title='x,y&#92;in&#92;mathbb{R}' class='latex' />, if for every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0%5C+%5C+%7Cx-y%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0&#92; &#92; |x-y|&lt;&#92;epsilon' title='&#92;epsilon&gt;0&#92; &#92; |x-y|&lt;&#92;epsilon' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' />.</li>
<p>The statement I have just written is of the general form</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%2Cy%5Cin%5Cmathbb%7BR%7D%5C+%5C+P%28x%2Cy%29%5Cimplies+Q%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x,y&#92;in&#92;mathbb{R}&#92; &#92; P(x,y)&#92;implies Q(x,y)' title='&#92;forall x,y&#92;in&#92;mathbb{R}&#92; &#92; P(x,y)&#92;implies Q(x,y)' class='latex' />.</li>
<p>Indeed, <img src='http://s0.wp.com/latex.php?latex=P%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x,y)' title='P(x,y)' class='latex' /> is the statement, </p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0%2C%5C+%5C+%7Cx-y%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0,&#92; &#92; |x-y|&lt;&#92;epsilon' title='&#92;epsilon&gt;0,&#92; &#92; |x-y|&lt;&#92;epsilon' class='latex' /></li>
<p>and <img src='http://s0.wp.com/latex.php?latex=Q%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x,y)' title='Q(x,y)' class='latex' /> is the statement <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' />.</p>
<p>Now the easiest way to prove this useful principle is to say the following. Suppose that <img src='http://s0.wp.com/latex.php?latex=x%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne y' title='x&#92;ne y' class='latex' />. Then <img src='http://s0.wp.com/latex.php?latex=x-y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y' title='x-y' class='latex' /> is some number that isn&#8217;t zero (since if <img src='http://s0.wp.com/latex.php?latex=x-y%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x-y=0' title='x-y=0' class='latex' />, then adding <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> to both sides gives us that <img src='http://s0.wp.com/latex.php?latex=x%3Dy&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x=y' title='x=y' class='latex' />). But then <img src='http://s0.wp.com/latex.php?latex=%7Cx-y%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|x-y|' title='|x-y|' class='latex' /> is a positive number, which shows that the statement </p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0%2C%5C+%5C+%7Cx-y%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0,&#92; &#92; |x-y|&lt;&#92;epsilon' title='&#92;epsilon&gt;0,&#92; &#92; |x-y|&lt;&#92;epsilon' class='latex' /></li>
<p>is false. (The statement claims that a certain inequality holds for every positive number <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon' title='&#92;epsilon' class='latex' />. But when <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3D%7Cx-y%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon=|x-y|' title='&#92;epsilon=|x-y|' class='latex' /> it doesn&#8217;t hold.)</p>
<p>In that argument, we started with the assumption <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q(x,y)' title='&#92;neg Q(x,y)' class='latex' /> (in other words, we assumed that <img src='http://s0.wp.com/latex.php?latex=x%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne y' title='x&#92;ne y' class='latex' />) and ended up proving <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%28x%2Cy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P(x,y)' title='&#92;neg P(x,y)' class='latex' /> (in other words, we proved that there is some positive number that is not greater than <img src='http://s0.wp.com/latex.php?latex=%7Cx-y%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|x-y|' title='|x-y|' class='latex' />).</p>
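<p>To make the witness in that argument concrete, here is a small Python sketch using exact rational arithmetic. The function name <code>failing_epsilon</code> and the sample values are hypothetical choices of mine; the point is only that unequal numbers hand us a specific positive number for which the inequality fails.</p>

```python
from fractions import Fraction


def failing_epsilon(x, y):
    """Given x != y, return a positive epsilon such that abs(x - y) is
    not smaller than epsilon, which refutes the "for every epsilon" claim."""
    if x == y:
        raise ValueError("no such epsilon exists when x == y")
    return abs(x - y)


# Two rationals that differ by a very small amount.
x = Fraction(1, 3)
y = Fraction(1, 3) + Fraction(1, 10**9)

eps = failing_epsilon(x, y)
print(eps > 0, abs(x - y) >= eps)
```

<p>When the two numbers are equal the function has nothing to return, which mirrors the difficulty of the direct approach: there is no obvious positive number to choose.</p>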
<p>What makes it easier to prove the contrapositive in the cases when it is indeed easier? It&#8217;s a bit tricky to say in general, so I&#8217;m not going to try too hard. Instead, let me offer some stylistic advice, which is that you should always start by trying to prove a statement in the obvious direct way, turning to the contrapositive if (i) you get stuck and (ii) you can see why you will be less stuck if you try to prove the contrapositive. Let&#8217;s see how that advice would have played out in the example above.</p>
<p>We would have started trying to prove the statement directly. So we&#8217;d have said, &#8220;Right, I&#8217;ve got two numbers <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' /> and I get to assume that the size of their difference is smaller than any positive number I choose. Hmm &#8230; but what number do I choose?&#8221; At that point it wouldn&#8217;t be all that obvious. Note that one number we <em>can&#8217;t</em> choose is <img src='http://s0.wp.com/latex.php?latex=%7Cx-y%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|x-y|' title='|x-y|' class='latex' />, since we don&#8217;t know that that number is positive. In fact, we&#8217;re trying to prove that it is <em>not</em> positive!</p>
<p>At that point, feeling stuck, we wonder whether starting with the assumption that <img src='http://s0.wp.com/latex.php?latex=x%5Cne+y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;ne y' title='x&#92;ne y' class='latex' /> might help. And we see that it does, since it provides us with a positive real number, namely the absolute value of the difference between <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y' title='y' class='latex' />. So when we decide to try to prove the contrapositive instead, we are not doing so merely because we are stuck &#8212; there are plenty of times when we get stuck where turning to the contrapositive is clearly of no help whatsoever &#8212; but for the additional reason that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q' title='&#92;neg Q' class='latex' /> gives us something that we needed in order to make progress.  </p>
<p>You may have heard this method called &#8220;proof by contradiction&#8221;. That isn&#8217;t strictly speaking correct (though this is a point that I myself didn&#8217;t understand until recently). There are <em>two</em> sorts of proof that make use of the negation of the conclusion. Just to give names to things, let&#8217;s suppose we are trying to prove the statement</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x).' title='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x).' class='latex' /></li>
<p>One type of proof is what I illustrated above: you simply prove the contrapositive instead. That is, you prove</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+%5Cneg+Q%28x%29%5Cimplies%5Cneg+P%28x%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; &#92;neg Q(x)&#92;implies&#92;neg P(x).' title='x&#92;in X&#92; &#92; &#92;neg Q(x)&#92;implies&#92;neg P(x).' class='latex' /></li>
<p>Another type of proof is this. You assume that the result is false. That is, you assume </p>
<li>There exists <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%28x%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q(x).' title='&#92;neg Q(x).' class='latex' /></li>
<p>and from that you proceed to deduce a contradiction: that is, you deduce a statement and its negation. There is <a href="http://mathoverflow.net/questions/12342/reductio-ad-absurdum-or-the-contrapositive">an interesting discussion of this difference at Mathoverflow</a>, which also contains a link to <a href="http://terrytao.wordpress.com/2009/11/05/the-no-self-defeating-object-argument/">a highly recommended blog post of Terence Tao</a>.</p>
<p><strong>More on converses.</strong></p>
<p>Now that we have talked about contrapositives, I can say something further about converses. We know that a statement is equivalent to its contrapositive. What happens if we start with a statement and look at the contrapositive of its converse? It will of course be equivalent to the converse, but what will it actually look like? </p>
<p>Well, if the original statement is <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' />, then the converse is <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />, and the contrapositive of that is <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies+%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies &#92;neg Q' title='&#92;neg P&#92;implies &#92;neg Q' class='latex' />. What this tells us is that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies&#92;neg Q' title='&#92;neg P&#92;implies&#92;neg Q' class='latex' /> is another way of formulating the converse of <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' />. Or at least, that&#8217;s the bare truth-value way of putting it. For the more typical cases of interest, I would want to say that we can formulate the converse of</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' title='x&#92;in X&#92; &#92; P(x)&#92;implies Q(x)' class='latex' />.</li>
<p>as</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X%5C+%5C+%5Cneg+P%28x%29%5Cimplies%5Cneg+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X&#92; &#92; &#92;neg P(x)&#92;implies&#92;neg Q(x)' title='x&#92;in X&#92; &#92; &#92;neg P(x)&#92;implies&#92;neg Q(x)' class='latex' />.</li>
<p>The practical consequence of this is that if you are asked to prove a statement that looks like this:</p>
<li>Let <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />. Prove that the following two statements are equivalent: (i) <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />; (ii) <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' />.</li>
<p>and you start by showing that <img src='http://s0.wp.com/latex.php?latex=P%28x%29%5Cimplies+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)&#92;implies Q(x)' title='P(x)&#92;implies Q(x)' class='latex' /> for some arbitrary <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' /> (which means that you&#8217;ve proved it for all <img src='http://s0.wp.com/latex.php?latex=x%5Cin+X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x&#92;in X' title='x&#92;in X' class='latex' />), then you have two options. Either you can say, &#8220;Conversely, let us assume that <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' />,&#8221; and proceed to try to deduce <img src='http://s0.wp.com/latex.php?latex=P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(x)' title='P(x)' class='latex' />, or you can say, &#8220;Conversely, let us assume that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P(x)' title='&#92;neg P(x)' class='latex' />,&#8221; and proceed to try to deduce <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q(x)' title='&#92;neg Q(x)' class='latex' />. The one thing <em>not</em> to do is say, &#8220;Conversely, let us assume that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q(x)' title='&#92;neg Q(x)' class='latex' />,&#8221; and proceed to try to prove <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P(x)' title='&#92;neg P(x)' class='latex' />. If you do that, then you are proving the <em>contrapositive</em> of what you proved before &#8212; i.e., basically the same result &#8212; and not the converse. I have seen this mistake many times. 
I&#8217;ve even made it myself, though I think I&#8217;ve always noticed before I&#8217;ve got very far. I recommend that if you ever decide to prove the converse of a statement by considering negations of the component parts, you check very carefully that you&#8217;ve got things the right way round.</p>
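<p>One way to check that you have things the right way round is to compare truth tables. The Python sketch below, in which <code>implies</code> is again my own helper for the truth-value definition, confirms which negated form matches the converse, which one merely restates the original, and on which rows a statement and its converse actually differ.</p>

```python
from itertools import product


def implies(p, q):
    """Truth-value definition: false only when p is true and q is false."""
    return not (p and not q)


rows = list(product([False, True], repeat=2))

# "not P implies not Q" has the same truth table as the converse "Q implies P".
converse_matches = all(
    implies(not p, not q) == implies(q, p) for p, q in rows
)

# "not Q implies not P" is the contrapositive of "P implies Q",
# so proving it gives nothing beyond the original statement.
contrapositive_matches_original = all(
    implies(not q, not p) == implies(p, q) for p, q in rows
)

# The converse is a different statement: list the rows where they disagree.
rows_where_converse_differs = [
    (p, q) for p, q in rows if implies(p, q) != implies(q, p)
]
print(converse_matches, contrapositive_matches_original)
print(rows_where_converse_differs)
```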
<p>As an example of this, consider the pair of statements with which I began this post. Suppose that we want to prove the following equivalence.</p>
<li>For every positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, the following are equivalent: (i) <img src='http://s0.wp.com/latex.php?latex=n%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=1' title='n=1' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a prime; (ii) <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />.</li>
<p>We might start by trying to show that (i) implies (ii). This turns out not to be obvious: it is a theorem known as <a href="http://en.wikipedia.org/wiki/Wilson's_theorem">Wilson&#8217;s theorem</a>, which will be covered in the Numbers and Sets course. But suppose we&#8217;ve finished that and we wonder about the converse. If we go for the <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' /> option, then we&#8217;ll be using a weird condition &#8212; that <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> is divisible by <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> &#8212; and trying to deduce from that a rather negative property of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> &#8212; that it hasn&#8217;t got any non-trivial factors. If on the other hand we go for the <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies&#92;neg Q' title='&#92;neg P&#92;implies&#92;neg Q' class='latex' /> option then we&#8217;re much better off. This time our hypothesis is that <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is composite, which gives us some factors to play around with, and instead of being forced to <em>use</em> a weird condition we merely have to <em>disprove</em> it, which is somehow an easier task.</p>
<p>Indeed, the whole deduction turns out to be easy this way round: if <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> isn&#8217;t prime then we can write <img src='http://s0.wp.com/latex.php?latex=n%3Dab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=ab' title='n=ab' class='latex' /> with both <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> less than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. So if we write the numbers from 1 to <img src='http://s0.wp.com/latex.php?latex=n-1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n-1' title='n-1' class='latex' /> out in a line, then <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> will be amongst them, from which it follows that <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!' title='(n-1)!' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' />, which equals <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. Since it&#8217;s not possible for both <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!' title='(n-1)!' 
class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> to be multiples of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, we have proved that the strange condition doesn&#8217;t hold, just as we wanted.</p>
<p>Or have we? Can you see a mistake in the above argument? If you can&#8217;t, then there is an important moral: don&#8217;t be fooled by your notation into making assumptions that you haven&#8217;t justified. In the above argument I assumed that <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> were different: if I hadn&#8217;t then it wouldn&#8217;t have followed that <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!' title='(n-1)!' class='latex' /> was a multiple of <img src='http://s0.wp.com/latex.php?latex=ab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='ab' title='ab' class='latex' />.</p>
<p>There are two ways of correcting the argument. One is to note that we can always write <img src='http://s0.wp.com/latex.php?latex=n%3Dab&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=ab' title='n=ab' class='latex' /> with <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> different unless <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is either a prime or the square of a prime. But if <img src='http://s0.wp.com/latex.php?latex=n%3Dp%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n=p^2' title='n=p^2' class='latex' /> for some prime <img src='http://s0.wp.com/latex.php?latex=p%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p,' title='p,' class='latex' /> then both <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=2p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2p' title='2p' class='latex' /> are less than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> &#8212; except if <img src='http://s0.wp.com/latex.php?latex=p%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p=2' title='p=2' class='latex' /> &#8212; so <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!' title='(n-1)!' class='latex' /> is a multiple of <img src='http://s0.wp.com/latex.php?latex=p%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p^2' title='p^2' class='latex' />. 
If <img src='http://s0.wp.com/latex.php?latex=p%3D2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p=2' title='p=2' class='latex' /> then we just look and see that <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1%3D7&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1=7' title='(n-1)!+1=7' class='latex' />, which is not a multiple of 4. </p>
<p>The other way of correcting the argument is cleaner because it doesn&#8217;t involve splitting into cases. We simply observe that if <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is a factor of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> that&#8217;s less than <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, then <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is also a factor of <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!' title='(n-1)!' class='latex' />. Therefore <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> is not a factor of <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' />. (I&#8217;m also assuming of course that <img src='http://s0.wp.com/latex.php?latex=a%5Cne+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a&#92;ne 1' title='a&#92;ne 1' class='latex' />.) But <img src='http://s0.wp.com/latex.php?latex=a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a' title='a' class='latex' /> <em>is</em> a factor of any multiple of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, so <img src='http://s0.wp.com/latex.php?latex=%28n-1%29%21%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(n-1)!+1' title='(n-1)!+1' class='latex' /> cannot be a multiple of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />. </p>
<p>So here we have an example of an equivalence with two properties mentioned above: one direction is easier than the other, and to prove the converse of a statement <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> it is easier to show that <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%5Cimplies%5Cneg+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P&#92;implies&#92;neg Q' title='&#92;neg P&#92;implies&#92;neg Q' class='latex' /> than it is to show that <img src='http://s0.wp.com/latex.php?latex=Q%5Cimplies+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q&#92;implies P' title='Q&#92;implies P' class='latex' />. </p>
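<p>For anyone who wants to see the equivalence in action, here is a short Python sketch that tests it for small values and also checks the key step of the cleaner argument. The function names and the bound <code>LIMIT</code> are illustrative assumptions of mine, and a finite check like this is of course no substitute for the proofs.</p>

```python
from math import factorial


def is_one_or_prime(n):
    """Condition (i): n equals 1 or n is prime (simple trial division)."""
    if n == 1:
        return True
    return all(n % d != 0 for d in range(2, n))


def wilson_condition(n):
    """Condition (ii): (n-1)! + 1 is divisible by n."""
    return (factorial(n - 1) + 1) % n == 0


def proper_factor(n):
    """Smallest factor of n strictly between 1 and n, or None if there is none."""
    for a in range(2, n):
        if n % a == 0:
            return a
    return None


LIMIT = 200

# Conditions (i) and (ii) agree for every n up to LIMIT.
equivalence_holds = all(
    is_one_or_prime(n) == wilson_condition(n) for n in range(1, LIMIT + 1)
)

# Key step of the cleaner argument: a proper factor a of n divides (n-1)!,
# so (n-1)! + 1 leaves remainder 1 when divided by a.
remainders = {
    n: (factorial(n - 1) + 1) % proper_factor(n)
    for n in range(2, LIMIT + 1)
    if proper_factor(n) is not None
}
print(equivalence_holds, set(remainders.values()))
```

<p>The case that tripped up the first argument, n = 4, is handled here with no special treatment, because the check uses only a single proper factor rather than two distinct ones.</p>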
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/05/basic-logic-relationships-between-statements-converses-and-contrapositives/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
		<item>
		<title>Basic logic &#8212; relationships between statements &#8212; negation</title>
		<link>http://gowers.wordpress.com/2011/10/02/basic-logic-relationships-between-statements-negation/</link>
		<comments>http://gowers.wordpress.com/2011/10/02/basic-logic-relationships-between-statements-negation/#comments</comments>
		<pubDate>Sun, 02 Oct 2011 09:37:05 +0000</pubDate>
		<dc:creator>gowers</dc:creator>
				<category><![CDATA[Basic logic]]></category>
		<category><![CDATA[Cambridge teaching]]></category>

		<guid isPermaLink="false">http://gowers.wordpress.com/?p=3270</guid>
		<description><![CDATA[I want to talk in the next couple of posts about transformations that can be applied to a statement. The three transformations I plan to discuss are forming the negation, the converse, and the contrapositive. For those who like an abstract definition to keep them going, let me quickly give the three relevant ones here. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gowers.wordpress.com&amp;blog=1659011&amp;post=3270&amp;subd=gowers&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I want to talk in the next couple of posts about transformations that can be applied to a statement. The three transformations I plan to discuss are forming the negation, the converse, and the contrapositive. For those who like an abstract definition to keep them going, let me quickly give the three relevant ones here. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is a statement, then &#8220;not <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' />&#8221;, sometimes written in symbolic form as <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P,' title='&#92;neg P,' class='latex' /> is its negation. If <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> has the form <img src='http://s0.wp.com/latex.php?latex=p%5Cimplies+q%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;implies q,' title='p&#92;implies q,' class='latex' /> then the converse of <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is the statement <img src='http://s0.wp.com/latex.php?latex=q%5Cimplies+p.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q&#92;implies p.' title='q&#92;implies p.' class='latex' /> And finally, again if <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> has the form <img src='http://s0.wp.com/latex.php?latex=p%5Cimplies+q%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;implies q,' title='p&#92;implies q,' class='latex' /> the contrapositive is the statement <img src='http://s0.wp.com/latex.php?latex=%5Cneg+q%5Cimplies%5Cneg+p.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg q&#92;implies&#92;neg p.' title='&#92;neg q&#92;implies&#92;neg p.' 
class='latex' /></p>
<p>If that was too abstract for you, then maybe you&#8217;ll be happier to pick the idea up by looking at some examples. (I myself find that easier. I like to see enough examples for the abstract concept to become obvious. But others seem to prefer the abstract concept in order to understand the point of the examples. In this post I am indulging those kinds of people.)<br />
<span id="more-3270"></span></p>
<p><strong>Negation.</strong></p>
<p>It isn&#8217;t that difficult to <em>define</em> negation. In symbolic terms, you negate a statement by sticking <img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> in front of it (putting the whole statement in brackets if it is complicated enough to need it). You can then read <img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> as &#8220;not&#8221;, or if you want to be absolutely sure you are getting the meaning right, as &#8220;it is not the case that&#8221;. It may seem a bit odd to have a whole post on this, when I&#8217;ve already devoted a post to the word &#8220;not&#8221;. But I wrote that post before writing about implication and quantifiers. That leaves me with things that still need to be said.</p>
<p>Let&#8217;s begin with a simple statement:</p>
<li><img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has at least three distinct prime factors.</li>
<p>According to what I said above, the negation of that statement is</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> (<img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has at least three distinct prime factors)</li>
<p>which can be translated into a more natural sentence in various ways. One possible translation is this.</p>
<li><img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> does not have at least three distinct prime factors.</li>
<p>However, you should be very wary about simply moving the word &#8220;not&#8221; to a place where the sentence reads more naturally, because <em>this can change the meaning of the sentence</em>.</p>
<p>For example, take the following sentence from Goldilocks and the Three Bears.</p>
<li>Somebody has been sitting in <em>my</em> chair.</li>
<p>We can form the negation in the unnatural way as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> (Somebody has been sitting in <em>my</em> chair.)</li>
<p>Suppose we were now to move the &#8220;not&#8221; to where the main verb is. We would get this:</p>
<li>Somebody has not been sitting in <em>my</em> chair.</li>
<p>But that is not the same as saying</p>
<li>It is not the case that somebody has been sitting in <em>my</em> chair.</li>
<p>That last sentence, which is the correct negation, is telling us that <em>nobody</em> has been sitting in <em>my</em> chair, whereas the careless moving of the &#8220;not&#8221; produced a sentence that merely told us that there is <em>somebody</em> who hasn&#8217;t been sitting in that chair.</p>
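<p>If it helps to see the difference concretely, here is a small Python sketch. The list of people and the set of chair-sitters are made up purely for illustration; the point is only that &#8220;not (somebody has sat)&#8221; and &#8220;somebody has not sat&#8221; can have different truth values in the same world.</p>

```python
# Hypothetical toy world: the names and the set below are invented for illustration.
people = ["Goldilocks", "Father Bear", "Mother Bear", "Baby Bear"]
sat_in_my_chair = {"Goldilocks"}  # the people who have sat in the chair

def has_sat(person):
    return person in sat_in_my_chair

# The original statement: somebody has been sitting in my chair.
original = any(has_sat(p) for p in people)

# Correct negation: it is not the case that somebody has been sitting in my chair.
correct_negation = not any(has_sat(p) for p in people)

# Careless version: somebody has not been sitting in my chair.
careless_version = any(not has_sat(p) for p in people)

print(original, correct_negation, careless_version)  # True False True
```

<p>In this toy world the original statement and the careless version are both true, so the careless version cannot be the negation of the original.</p>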
<p>Let us return to the sentence</p>
<li><img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has at least three distinct prime factors.</li>
<p>If we want to negate it, we should think to ourselves, &#8220;What has to happen for this sentence to be false?&#8221; In this case it is quite simple: we know that the number of distinct prime factors of <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> is a non-negative integer, so the only way for it not to be the case that that non-negative integer is at least 3 is for it to be at most 2. So a clean way of expressing the negation of this sentence is</p>
<li><img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> has at most two distinct prime factors.</li>
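<p>As a sanity check, the following Python sketch compares the two forms of the negation for every positive integer below 1000. The helper name <code>distinct_prime_factors</code> is just something chosen for this illustration, and it uses plain trial division.</p>

```python
def distinct_prime_factors(n):
    """Return the set of distinct primes dividing the positive integer n."""
    factors = set()
    d = 2
    while n >= d * d:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)  # whatever is left over is itself prime
    return factors

for n in range(1, 1000):
    k = len(distinct_prime_factors(n))
    # "not (at least three)" and "at most two" agree for every n tested
    assert (not k >= 3) == (2 >= k)

print(sorted(distinct_prime_factors(30)))  # [2, 3, 5]
```

<p>Of course the check is not a proof; it works simply because the number of distinct prime factors is an integer, which is exactly the observation made above.</p>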
<p><strong>Negating a statement that begins with a quantifier.</strong></p>
<p>It is very important to know how to negate sentences where quantifiers are involved. (The Goldilocks example, with the word &#8220;somebody&#8221;, is an example of the kind of mistake it is possible to make if you don&#8217;t.) Fortunately, negating sentences with quantifiers is extremely easy once you know how to do it. In fact, it is so easy that you can do it completely automatically <em>even if you have no understanding whatsoever of the statement that you are negating</em>.</p>
<p>To explain this, let me begin with two simple examples, which both involve just one quantifier. First let&#8217;s go for a universal quantifier. I&#8217;ll write a sentence in words and also in more symbolic form. Let&#8217;s write <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> for the set of all prime numbers.</p>
<li>Every prime number is odd.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+p%5Cin+P%5C+%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall p&#92;in P&#92; &#92; p' title='&#92;forall p&#92;in P&#92; &#92; p' class='latex' />  is odd.</li>
<p>What has to be true to make it <em>not</em> the case that every prime number is odd? Do we need every prime number to be even? Not at all. All we need is for there to be at least one exception: that is, we need there to <em>exist</em> a prime number that is even. In other words, the negation of the above statement, again in two forms, is this.</p>
<li>Some prime number is even.</li>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+p%5Cin+P%5C+%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists p&#92;in P&#92; &#92; p' title='&#92;exists p&#92;in P&#92; &#92; p' class='latex' /> is even.</li>
<p>The general rule here is that if you ever have a statement of the form</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5C+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92; Q(x)' title='&#92;forall x&#92;in X&#92; &#92; Q(x)' class='latex' /></li>
<p>then its negation is</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5Cin+X%5C+%5C+%5Cneg+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92;in X&#92; &#92; &#92;neg Q(x)' title='&#92;exists x&#92;in X&#92; &#92; &#92;neg Q(x)' class='latex' /></li>
<p>Here <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> is some statement about things like x, and <img src='http://s0.wp.com/latex.php?latex=Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q(x)' title='Q(x)' class='latex' /> is what you get when you apply <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> in particular. Perhaps I can say this a bit more wordily. If you ever have a statement of the form</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> in <img src='http://s0.wp.com/latex.php?latex=X%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X,' title='X,' class='latex' /> fact F holds for x.</li>
<p>then its negation will be</p>
<li>There is an <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' /> in <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> for which fact F does not hold.</li>
<p>In the case of the primes-being-odd example, the original statement was false and its negation was true (since 2 is a prime).</p>
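<p>On a finite set the rule can be checked directly. The Python sketch below uses the primes below 100 as a finite stand-in for the set of all primes (an assumption made only so that the computation terminates), with <code>all</code> playing the role of &#8220;for every&#8221; and <code>any</code> the role of &#8220;there exists&#8221;.</p>

```python
def is_prime(n):
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

primes = [n for n in range(2, 100) if is_prime(n)]  # finite stand-in for P

def is_odd(p):
    return p % 2 == 1

every_prime_is_odd = all(is_odd(p) for p in primes)
some_prime_is_not_odd = any(not is_odd(p) for p in primes)

# The negation of the "for all" statement agrees with the "there exists" statement.
assert (not every_prime_is_odd) == some_prime_is_not_odd

witnesses = [p for p in primes if not is_odd(p)]
print(every_prime_is_odd, some_prime_is_not_odd, witnesses)  # False True [2]
```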
<p>If you have read the post on NOT, you may remember a general rule that I mentioned: that if you negate something strong you get something weak, and vice versa. That applied in the example above. The statement, &#8220;Every prime is odd,&#8221; is strong because it is telling us about <em>all</em> primes. (The fact that it is false is neither here nor there. It is both strong and false.) We therefore expect its negation to be weak, and indeed that is the case: it merely tells us that <em>at least one</em> prime has a certain property (that of not being odd). In principle there might be just one non-odd prime with all the rest being odd, and the statement would still be true. Oh, and hang on &#8230; that is actually the case.</p>
<p>There are therefore no prizes for guessing what happens when you negate a statement that begins with an existential quantifier. You should expect something strong, and something strong is indeed what you get. Let&#8217;s take a non-mathematical example.</p>
<li>Someone in the world is over 125 years old.</li>
<p>Putting that in symbolic form gives us this. I&#8217;ll let <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> stand for the set of all living human beings and I&#8217;ll write <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> for a typical element of <img src='http://s0.wp.com/latex.php?latex=H&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H' title='H' class='latex' /> (that is, a typical living human being). </p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+h%5Cin+H%5C+%5C+h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists h&#92;in H&#92; &#92; h' title='&#92;exists h&#92;in H&#92; &#92; h' class='latex' /> is over 125 years old.</li>
<p>For that to be false, there cannot be a single living human being who is over 125 years old. That is, for <em>every</em> living human being <img src='http://s0.wp.com/latex.php?latex=h%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h,' title='h,' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> is <em>not</em> over 125 years old.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+h%5Cin+H%5C+%5C+h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall h&#92;in H&#92; &#92; h' title='&#92;forall h&#92;in H&#92; &#92; h' class='latex' /> is not over 125 years old.</li>
<p>In general, the rule is just like the &#8220;for all&#8221; rule but with the quantifiers the other way round. That is, the negation of</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+x%5Cin+X%5C+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists x&#92;in X&#92; Q(x)' title='&#92;exists x&#92;in X&#92; Q(x)' class='latex' /></li>
<p>is</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+x%5Cin+X%5C+%5Cneg+Q%28x%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall x&#92;in X&#92; &#92;neg Q(x)' title='&#92;forall x&#92;in X&#92; &#92;neg Q(x)' class='latex' /></li>
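<p>The same kind of check works for the existential rule. In the Python sketch below the names and ages are invented, and a small dictionary stands in for the set of all living human beings.</p>

```python
def statements(ages):
    """Return the pair (someone is over 125, everyone is not over 125)."""
    someone_over_125 = any(age > 125 for age in ages.values())
    everyone_not_over_125 = all(not age > 125 for age in ages.values())
    return someone_over_125, everyone_not_over_125

# Made-up data standing in for the set H.
ages = {"Ann": 34, "Bob": 71, "Chandra": 102, "Dmitri": 8}
print(statements(ages))  # (False, True)

# Add a hypothetical 130-year-old and both truth values flip.
ages["Eve"] = 130
print(statements(ages))  # (True, False)
```

<p>In both worlds exactly one of the two statements is true, which is what it means for each to be the negation of the other.</p>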
<p>Going back to the example about very old people, perhaps you will object that the statement that there exists somebody who is over 125 years old is not that weak after all, since being over 125 years old is so amazingly unexpected. That&#8217;s a reasonable point, so let&#8217;s take the sentence apart a little. The statement, &#8220;<img src='http://s0.wp.com/latex.php?latex=h&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h' title='h' class='latex' /> is over 125 years old,&#8221; is indeed an amazingly strong thing to say about <img src='http://s0.wp.com/latex.php?latex=h.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='h.' title='h.' class='latex' /> However, if we want to dilute that strength as much as we possibly can without going to the homoeopathic extreme of saying nothing at all, then the best we can do is to say that <em>somebody somewhere</em> has exceeded the age of 125, without giving any more information. So the quantifier is weak, but the bit inside the quantifier is strong. When we negate it, we reverse this. Now we are saying that <em>everybody</em> (wow, that&#8217;s quite strong) is at most 125 (oh &#8230; what we&#8217;re actually saying about everybody isn&#8217;t that surprising).</p>
<p>Something else that you may have noticed if you are feeling alert is that there is a close parallel between what I have just been discussing and de Morgan&#8217;s laws (which are discussed towards the end of my post on the logical connective NOT). Recall that de Morgan&#8217;s laws tell us that to negate an AND you turn it into an OR and put a NOT in front of each of the statements that were ANDed together. That is, <img src='http://s0.wp.com/latex.php?latex=%5Cneg%28P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(P' title='&#92;neg(P' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=Q%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q)' title='Q)' class='latex' /> becomes <img src='http://s0.wp.com/latex.php?latex=%5Cneg+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg P' title='&#92;neg P' class='latex' /> or <img src='http://s0.wp.com/latex.php?latex=%5Cneg+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg Q.' title='&#92;neg Q.' class='latex' /> While I&#8217;m at it, let me reveal that the symbols <img src='http://s0.wp.com/latex.php?latex=%5Cwedge&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;wedge' title='&#92;wedge' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cvee&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vee' title='&#92;vee' class='latex' /> are often used for AND and OR. With the help of those, we can write de Morgan&#8217;s laws as</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cneg%28P%5Cwedge+Q%29%5Ciff+%5Cneg+P%5Cvee%5Cneg+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(P&#92;wedge Q)&#92;iff &#92;neg P&#92;vee&#92;neg Q.' title='&#92;neg(P&#92;wedge Q)&#92;iff &#92;neg P&#92;vee&#92;neg Q.' class='latex' /></li>
<p>and</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cneg%28P%5Cvee+Q%29%5Ciff+%5Cneg+P%5Cwedge%5Cneg+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(P&#92;vee Q)&#92;iff &#92;neg P&#92;wedge&#92;neg Q.' title='&#92;neg(P&#92;vee Q)&#92;iff &#92;neg P&#92;wedge&#92;neg Q.' class='latex' /></li>
<p>The fact that NOT turns AND into OR and vice versa feels quite like the fact that NOT turns <img src='http://s0.wp.com/latex.php?latex=%5Cforall&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall' title='&#92;forall' class='latex' /> into <img src='http://s0.wp.com/latex.php?latex=%5Cexists&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists' title='&#92;exists' class='latex' /> and vice versa (with the NOT shunted inside in all cases). And we can see why this is if we think of a universal quantifier as an enormous AND and an existential quantifier as an enormous OR. For example, the statement</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+p%5Cin+P%5C+%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall p&#92;in P&#92; &#92; p' title='&#92;forall p&#92;in P&#92; &#92; p' class='latex' /> is odd.</li>
<p>that we considered earlier could be thought of as a concise way of saying this:</p>
<li>2 is odd and 3 is odd and 5 is odd and 7 is odd and 11 is odd and 13 is odd and 17 is odd and 19 is odd and 23 is odd and 29 is odd and 31 is odd and 37 is odd and 41 is odd and 43 is odd and 47 is odd and 53 is odd and &#8230;</li>
<p>So when we negate it, we should obtain a concise way of saying this:</p>
<li>2 is even or 3 is even or 5 is even or 7 is even or 11 is even or 13 is even or 17 is even or 19 is even or 23 is even or 29 is even or 31 is even or 37 is even or 41 is even or 43 is even or 47 is even or 53 is even or &#8230;</li>
<p>and indeed we do, since</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+p%5Cin+P%5C+%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists p&#92;in P&#92; &#92; p' title='&#92;exists p&#92;in P&#92; &#92; p' class='latex' /> is even.</li>
<p>is saying precisely that.</p>
<p>Similarly, the statement that somebody in the world is over 125 years old is shorthand for, &#8220;I am over 125 years old or you are over 125 years old or Lord Rees of Ludlow is over 125 years old or George Osborne is over 125 years old or Cheryl Cole is over 125 years old or Andy Murray is over 125 years old or the younger son of the people who used to live in the house next door until a few years ago is over 125 years old or &#8230;&#8221; You get the idea.</p>
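<p>The &#8220;enormous AND&#8221; and &#8220;enormous OR&#8221; picture can be made literal on a finite list. The Python sketch below folds the two-argument connectives over the first sixteen primes (the same ones listed above) and compares the result with the built-in <code>all</code> and <code>any</code>. Truncating to finitely many primes is an assumption of the sketch, not part of the mathematics.</p>

```python
from functools import reduce
from operator import and_, or_

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]
odd = [p % 2 == 1 for p in primes]  # the truth values of "p is odd"

# 2 is odd and 3 is odd and 5 is odd and ...
big_and = reduce(and_, odd)

# 2 is even or 3 is even or 5 is even or ...
big_or_of_nots = reduce(or_, [not b for b in odd])

assert big_and == all(odd)
assert big_or_of_nots == any(not b for b in odd)
assert (not big_and) == big_or_of_nots  # de Morgan's law for a long AND

print(big_and, big_or_of_nots)  # False True
```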
<p><strong>Negating a statement that begins with several quantifiers.</strong></p>
<p>Right, that&#8217;s single quantifiers dealt with. But what happens if we&#8217;ve got a more complicated sentence like the definition of convergence that was discussed in the post on quantifiers? That was as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+%5Cepsilon%3E0%5C+%5C+%5Cexists+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cforall+n%5Cgeq+N%5C+%5C+%7Ca_n-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall &#92;epsilon&gt;0&#92; &#92; &#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon.' title='&#92;forall &#92;epsilon&gt;0&#92; &#92; &#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon.' class='latex' /></li>
<p>If you still feel that you don&#8217;t really understand this sentence, that&#8217;s OK. In fact, it&#8217;s almost better than if you do understand it, since then you really will see just how mechanical it is to negate statements that involve quantifiers.</p>
<p>The first thing to understand is that there is an implicit bracketing going on. We could if we wanted write the statement as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+%5Cepsilon%3E0%5C+%5C+%28%5Cexists+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cforall+n%5Cgeq+N%5C+%5C+%7Ca_n-a%7C%3C%5Cepsilon%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall &#92;epsilon&gt;0&#92; &#92; (&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon).' title='&#92;forall &#92;epsilon&gt;0&#92; &#92; (&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon).' class='latex' /></li>
<p>In other words, we could think of it as saying, &#8220;For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon' title='&#92;epsilon' class='latex' /> something weird happens.&#8221; If you don&#8217;t find it weird, that&#8217;s not a problem &#8212; what matters is that we&#8217;ve got some statement that involves <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon.' title='&#92;epsilon.' class='latex' /> (In theory it doesn&#8217;t even have to involve <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon,' title='&#92;epsilon,' class='latex' /> but if it didn&#8217;t then it <em>would</em> be a bit weird. It would be like saying, &#8220;For every prime number the president of the United States is Barack Obama.&#8221; That&#8217;s a true statement, but the role of prime numbers is a bit puzzling.)</p>
<p>But now we have reduced the task to something we know how to do. If we want to negate a statement, the first step is to put the whole statement in brackets and stick a <img src='http://s0.wp.com/latex.php?latex=%5Cneg&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg' title='&#92;neg' class='latex' /> on the front. In our case, we get the following.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cneg%28%5Cforall+%5Cepsilon%3E0%5C+%5C+%28%5Cexists+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cforall+n%5Cgeq+N%5C+%5C+%7Ca_n-a%7C%3C%5Cepsilon%29%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg(&#92;forall &#92;epsilon&gt;0&#92; &#92; (&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon)).' title='&#92;neg(&#92;forall &#92;epsilon&gt;0&#92; &#92; (&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon)).' class='latex' /></li>
<p>Now we use the rule that we can change <img src='http://s0.wp.com/latex.php?latex=%5Cneg%5Cforall&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;neg&#92;forall' title='&#92;neg&#92;forall' class='latex' /> into <img src='http://s0.wp.com/latex.php?latex=%5Cexists%5Cneg.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists&#92;neg.' title='&#92;exists&#92;neg.' class='latex' /> That is, to negate a &#8220;for every&#8221; you can change it to &#8220;there exists&#8221; and bring the negation inside. Doing that here we get the following.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+%5Cepsilon%3E0%5C+%5C+%5Cneg%28%5Cexists+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cforall+n%5Cgeq+N%5C+%5C+%7Ca_n-a%7C%3C%5Cepsilon%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;neg(&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon).' title='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;neg(&#92;exists N&#92;in&#92;mathbb{N}&#92; &#92; &#92;forall n&#92;geq N&#92; &#92; |a_n-a|&lt;&#92;epsilon).' class='latex' /></li>
<p>So far all we&#8217;ve done is apply the rule that works with a single quantifier, and completely ignored what&#8217;s in the inner set of brackets. But now if we stop ignoring that, we see that we&#8217;ve still got a sentence with quantifiers that needs negating. What can we do? </p>
<p>Presumably you&#8217;ve spotted that we can just repeat the process. We are in exactly the situation we were in before (a &#8220;not&#8221; outside some quantifiers) so we can do exactly the same thing (make the &#8220;not&#8221; and the outer quantifier switch places, and turn the outer quantifier into the other type). I won&#8217;t give all the gory details. I&#8217;ll just say that you can bring the NOT past <em>all</em> the quantifiers as long as you swap them round. The result of doing that is the following.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+%5Cepsilon%3E0%5C+%5C+%5Cforall+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cexists+n%5Cgeq+N%5C+%5C+%5Cneg%28%7Ca_n-a%7C%3C%5Cepsilon%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;forall N&#92;in&#92;mathbb{N}&#92; &#92; &#92;exists n&#92;geq N&#92; &#92; &#92;neg(|a_n-a|&lt;&#92;epsilon).' title='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;forall N&#92;in&#92;mathbb{N}&#92; &#92; &#92;exists n&#92;geq N&#92; &#92; &#92;neg(|a_n-a|&lt;&#92;epsilon).' class='latex' /></li>
<p>It remains to negate the bit right inside that doesn&#8217;t involve quantifiers. Doing that gives us the following statement.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+%5Cepsilon%3E0%5C+%5C+%5Cforall+N%5Cin%5Cmathbb%7BN%7D%5C+%5C+%5Cexists+n%5Cgeq+N%5C+%5C+%7Ca_n-a%7C%5Cgeq%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;forall N&#92;in&#92;mathbb{N}&#92; &#92; &#92;exists n&#92;geq N&#92; &#92; |a_n-a|&#92;geq&#92;epsilon.' title='&#92;exists &#92;epsilon&gt;0&#92; &#92; &#92;forall N&#92;in&#92;mathbb{N}&#92; &#92; &#92;exists n&#92;geq N&#92; &#92; |a_n-a|&#92;geq&#92;epsilon.' class='latex' /></li>
<p>Here I&#8217;m using the rule that negating &lt; turns it into <img src='http://s0.wp.com/latex.php?latex=%5Cgeq.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;geq.' title='&#92;geq.' class='latex' /> (Similarly, negating &gt; turns it into <img src='http://s0.wp.com/latex.php?latex=%5Cleq&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;leq' title='&#92;leq' class='latex' />.) Note that strict inequalities become non-strict ones, which is yet another example of strong statements becoming weak ones.</p>
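<p>To underline how mechanical the process is, here is a Python sketch that negates a quantified statement without understanding it at all. The representation (nested tuples, with a plain string for the quantifier-free part) and the names <code>negate</code> and <code>show</code> are choices made for this illustration only. The inequality is written with the larger side first, which means the same thing as the version above.</p>

```python
FLIP = {"forall": "exists", "exists": "forall"}

def negate(formula):
    """Push a negation through the leading quantifiers of a formula.

    A formula is either ("forall", bound, body), ("exists", bound, body),
    or a plain string for the quantifier-free part.
    """
    if isinstance(formula, tuple):
        quantifier, bound, body = formula
        # Swap the quantifier, keep the bound, and carry the NOT inside.
        return (FLIP[quantifier], bound, negate(body))
    return "not (" + formula + ")"

def show(formula):
    """Render a formula as a single line of text."""
    if isinstance(formula, tuple):
        quantifier, bound, body = formula
        return quantifier + " " + bound + ", " + show(body)
    return formula

convergence = ("forall", "eps > 0",
               ("exists", "N in NN",
                ("forall", "n >= N", "eps > |a_n - a|")))

print(show(negate(convergence)))
# exists eps > 0, forall N in NN, exists n >= N, not (eps > |a_n - a|)
```

<p>The program stops at the innermost statement and simply wraps it in a &#8220;not&#8221;. Turning that into a non-strict inequality is the one step that needs a rule about the particular statement involved.</p>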
<p><strong>Negating implications.</strong></p>
<p>There&#8217;s one other thing you need to know if you want to carry out the technique just described, which is how to negate a statement of the form <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q.' title='P&#92;implies Q.' class='latex' /> Consider, for example, the following statement: every prime of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> can be written as a sum of two squares. In a more symbolic form, we might represent this as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cforall+p%5Cin+P%5C+%5C+%28p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall p&#92;in P&#92; &#92; (p' title='&#92;forall p&#92;in P&#92; &#92; (p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+%5Cimplies%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; &#92;implies&#92; p' title='4m+1&#92; &#92;implies&#92; p' class='latex' /> can be written as a sum of two squares).</li>
<p>As a first step towards negating this, we do the trick discussed above, flipping all the quantifiers (in this case only one), which gives us this.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+p%5Cin+P%5C+%5C+%5Cneg%28p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists p&#92;in P&#92; &#92; &#92;neg(p' title='&#92;exists p&#92;in P&#92; &#92; &#92;neg(p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+%5Cimplies%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; &#92;implies&#92; p' title='4m+1&#92; &#92;implies&#92; p' class='latex' /> can be written as a sum of two squares).</li>
<p>The question is, what do we do now? We&#8217;ve got a NOT outside a statement of the form <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q%3A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q:' title='P&#92;implies Q:' class='latex' /> how do we simplify it further? </p>
<p>Let&#8217;s apply a few principles that I&#8217;ve been going on about. The first is that if you want to understand a negation, or check whether what you&#8217;ve claimed is a negation really is a negation, you should always ask yourself the question, &#8220;What needs to be true for this statement to be false?&#8221; So let&#8217;s do that here. What needs to be true for it <em>not</em> to be the case that if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> can be written as a sum of two squares? And to answer <em>that</em> question we go back further to the principle that says that the only thing that can make an implication <img src='http://s0.wp.com/latex.php?latex=P%5Cimplies+Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P&#92;implies Q' title='P&#92;implies Q' class='latex' /> false is if <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> is true and <img src='http://s0.wp.com/latex.php?latex=Q&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Q' title='Q' class='latex' /> is false. So there&#8217;s the answer handed to us on a plate. It tells us that the only way for</p>
<li><img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+%5Cimplies%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; &#92;implies&#92; p' title='4m+1&#92; &#92;implies&#92; p' class='latex' /> can be written as a sum of two squares</li>
<p>to be false is if <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> but <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> cannot be written as a sum of two squares. So now we&#8217;ve completed our negation. It reads as follows.</p>
<li><img src='http://s0.wp.com/latex.php?latex=%5Cexists+p%5Cin+P%5C+%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;exists p&#92;in P&#92; &#92; p' title='&#92;exists p&#92;in P&#92; &#92; p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; ' title='4m+1&#92; ' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> cannot be written as a sum of two squares.</li>
<p>By the way, it is a famous theorem that a prime can be written as a sum of two squares if and only if it is equal to 2 or is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> for some positive integer <img src='http://s0.wp.com/latex.php?latex=m.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m.' title='m.' class='latex' /> We knew in advance that precisely one out of the original statement and its negation had to be true. It happens to be the original statement that is true and the negation that is false.</p>
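<p>As a sanity check, the statement and its negation can be evaluated side by side on a finite range of primes. What follows is only a minimal sketch: the cut-off <code>BOUND</code> and the helper names are assumptions made for illustration, and checking primes below a bound is of course not a proof of the theorem.</p>

```python
from math import isqrt

BOUND = 200  # hypothetical cut-off: a finite check, not a proof


def is_prime(n: int) -> bool:
    """Trial division, adequate for the small bound used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_sum_of_two_squares(n: int) -> bool:
    """True if n = a*a + b*b for some non-negative integers a, b."""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))


def implies(p: bool, q: bool) -> bool:
    """Truth-value implication: false only when p is true and q is false."""
    return (not p) or q


PRIMES = [p for p in range(2, BOUND) if is_prime(p)]

# Statement: for every prime p, (p = 4m+1) implies (p is a sum of two squares).
statement = all(implies(p % 4 == 1, is_sum_of_two_squares(p)) for p in PRIMES)

# Negation: there exists a prime p with p = 4m+1 AND p not a sum of two squares.
counterexamples = [p for p in PRIMES if p % 4 == 1 and not is_sum_of_two_squares(p)]
negation = len(counterexamples) > 0

print(statement, negation, counterexamples)
```

<p>The universal quantifier becomes <code>all</code>, the existential quantifier becomes <code>any</code> (here, a non-empty list of witnesses), and the implication becomes an AND with the right-hand side negated. On any range of primes the two values must come out opposite.</p>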
<p><strong>Do implications need quantifiers?</strong></p>
<p>There&#8217;s a small point in there that confuses some people (including me when I was an undergraduate). What does the statement</p>
<li><img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+%5Cimplies%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; &#92;implies&#92; p' title='4m+1&#92; &#92;implies&#92; p' class='latex' /> can be written as a sum of two squares</li>
<p>mean? A very natural way of reading it is as a general statement about primes <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> (the context making it clear that we are talking about primes here). In other words, it sort of feels as though this statement is saying that every prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> that can be written in the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> can also be written as a sum of two squares. As I discussed at some length in my post about IMPLIES, we think about <em>properties</em>. In this case, we think of the property is-of-the-form-<img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' /> as implying the property is-expressible-as-a-sum-of-two-squares, but it is much cleaner to write</p>
<li><img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1%5C+%5Cimplies%5C+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1&#92; &#92;implies&#92; p' title='4m+1&#92; &#92;implies&#92; p' class='latex' /> can be written as a sum of two squares</li>
<p>and simply understand that <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is an arbitrary prime than it is to write something like</p>
<li>(is of the form <img src='http://s0.wp.com/latex.php?latex=4m%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='4m+1' title='4m+1' class='latex' />) implies (can be written as a sum of two squares) when applied to primes</li>
<p>However, my actual interpretation of &#8220;implies&#8221; when I was analysing the sentence and its negation was not the property interpretation but the truth-value interpretation. For the purposes of the above statement I was thinking of <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> as fixed (but unknown) and the &#8220;implies&#8221; symbol was merely telling me that for that fixed <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> it was not the case that the left-hand side was true and the right-hand side false. Only <em>after</em> I stuck <img src='http://s0.wp.com/latex.php?latex=%5Cforall+p%5Cin+P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;forall p&#92;in P' title='&#92;forall p&#92;in P' class='latex' /> on the outside did that become a general statement about the primes. To avoid confusion it&#8217;s probably best to stick to this convention. And if you find it hard to hold the entire convention and its explanation in your head, then a simpler rule of thumb may perhaps suffice, which is</p>
<li>If in doubt, put in the quantifier.</li>
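<p>The difference between the two readings can be made concrete in a short sketch. Under the truth-value reading, the implication evaluated at one fixed prime is just a single true or false; generality only appears when a quantifier is wrapped round it. The function name <code>implication_at</code> and the sample of primes below are assumptions chosen for illustration.</p>

```python
from math import isqrt


def is_sum_of_two_squares(n: int) -> bool:
    """True if n = a*a + b*b for some non-negative integers a, b."""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))


def implication_at(p: int) -> bool:
    """Truth value, for one fixed p, of: (p = 4m+1) implies (p is a sum of two squares)."""
    left = p % 4 == 1
    right = is_sum_of_two_squares(p)
    # False only when the left-hand side is true and the right-hand side is false.
    return not (left and not right)


SAMPLE_PRIMES = [3, 5, 7, 13, 19, 29]  # hand-picked primes, purely for illustration

# One truth value per fixed p: nothing general has been said yet.
truth_values = {p: implication_at(p) for p in SAMPLE_PRIMES}

# Only the explicit quantifier turns these into a general statement.
for_all_sampled = all(implication_at(p) for p in SAMPLE_PRIMES)

print(truth_values)
print(for_all_sampled)
```

<p>Note that the implication is true at 3, 7 and 19 even though none of them is a sum of two squares: the left-hand side is false there, so there is nothing to check.</p>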
<p>It is quite hard to avoid all confusion on this score, because people <em>do</em> say things like this. &#8220;Let us fix a prime <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> in the set <img src='http://s0.wp.com/latex.php?latex=A.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A.' title='A.' class='latex' /> Since all elements of <img src='http://s0.wp.com/latex.php?latex=A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='A' title='A' class='latex' /> leave a remainder of 1 when you divide by 4, this implies, by a theorem of Fermat, that <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> can be written as a sum of two squares.&#8221; From the way this sentence is written, it looks as though the implication is</p>
<li>(<img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> is prime and <img src='http://s0.wp.com/latex.php?latex=p%5Cin+A&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p&#92;in A' title='p&#92;in A' class='latex' />) implies (<img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> can be written as a sum of two squares)</li>
<p>However, it is implicit in the discussion that the <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> in question is not <em>genuinely</em> fixed. If you asked the person who was talking, &#8220;You&#8217;ve just said you fixed <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. So can you tell us what prime you&#8217;re talking about please?&#8221; you would get short shrift. When we say that we have &#8220;fixed&#8221; <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> it is a sort of lie: actually we are talking about an arbitrary, general <img src='http://s0.wp.com/latex.php?latex=p%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p,' title='p,' class='latex' /> which is another way of saying that we are talking about all <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />. So the reason that the confusion occurs is that there is a very useful informal way of talking that hides the quantifiers. It&#8217;s often fine to do that, just as it is often fine to wear jeans. But you don&#8217;t wear jeans at a black-tie dinner, and you don&#8217;t leave out the quantifiers when you are writing out a proof carefully (and at this early stage you should be careful with all your proofs).</p>
<p>To show the sort of confusion that can arise if you disregard this advice, let&#8217;s go back to the definition of convergence. Recall that it was the following; I&#8217;ll give a slightly wordy version.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> there exists a positive integer <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> we have <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon.' title='|a_n-a|&lt;&#92;epsilon.' class='latex' /></li>
<p>Suppose we decided to write it instead like this.</p>
<li>For every <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> there exists a positive integer <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> implies that <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon' title='|a_n-a|&lt;&#92;epsilon' class='latex' />.</li>
<p>It&#8217;s not that hard to see what that second formulation means: the word &#8220;implies&#8221; has its &#8220;property&#8221; sense, and is telling us that being at least <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> guarantees being far enough along in the sequence to be within <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon' title='&#92;epsilon' class='latex' /> of <img src='http://s0.wp.com/latex.php?latex=a.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a.' title='a.' class='latex' /> However, this mix of formality and informality (jeans at the black-tie dinner) is dangerous. Suppose, for instance, that we decide to negate the sentence in its second form. Applying the mechanical procedure, we might begin by getting to this.</p>
<li>There exists some <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N}' title='N&#92;in&#92;mathbb{N}' class='latex' /> NOT (<img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> implies that <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon).' title='|a_n-a|&lt;&#92;epsilon).' class='latex' /></li>
<p>Our remaining task is to negate the bit in brackets. Well, we say, the only way that can fail is if <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%5Cgeq%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&#92;geq&#92;epsilon.' title='|a_n-a|&#92;geq&#92;epsilon.' class='latex' /> So we might write this.</p>
<li>There exists some <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N},' title='N&#92;in&#92;mathbb{N},' class='latex' /> <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%5Cgeq%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&#92;geq&#92;epsilon.' title='|a_n-a|&#92;geq&#92;epsilon.' class='latex' /></li>
<p>However, this doesn&#8217;t make sense. What is <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />? It&#8217;s not at all clear what the above sentence means. The mistake was to treat the statement &#8220;<img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> implies that <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon' title='|a_n-a|&lt;&#92;epsilon' class='latex' />&#8221; as though it were referring to a single <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> and using the truth-value notion of &#8220;implies&#8221;, when in fact it was an informal way of expressing the <em>general</em> statement</p>
<li>For every positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' />, if <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> then <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%3C%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&lt;&#92;epsilon.' title='|a_n-a|&lt;&#92;epsilon.' class='latex' /></li>
<p>If we include the quantifier, then we will not make this mistake. Instead, we will write the correct negation, which is</p>
<li>There exists some <img src='http://s0.wp.com/latex.php?latex=%5Cepsilon%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;epsilon&gt;0' title='&#92;epsilon&gt;0' class='latex' /> such that for every <img src='http://s0.wp.com/latex.php?latex=N%5Cin%5Cmathbb%7BN%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N&#92;in&#92;mathbb{N}' title='N&#92;in&#92;mathbb{N}' class='latex' /> there exists a positive integer <img src='http://s0.wp.com/latex.php?latex=n&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n' title='n' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=n%5Cgeq+N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;geq N' title='n&#92;geq N' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%7Ca_n-a%7C%5Cgeq%5Cepsilon.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|a_n-a|&#92;geq&#92;epsilon.' title='|a_n-a|&#92;geq&#92;epsilon.' class='latex' /></li>
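<p>The quantifier structure of the definition and of its correct negation can be mirrored in code, with every quantifier written out and every variable bound. The sketch below is a bounded caricature rather than a test of genuine convergence: the finite list <code>EPSILONS</code>, the limits <code>N_MAX</code> and <code>HORIZON</code>, and the function names are all assumptions introduced for illustration.</p>

```python
from typing import Callable

EPSILONS = [1.0, 0.5, 0.1]  # hypothetical finite sample of epsilon values
N_MAX = 50                  # hypothetical largest N tried
HORIZON = 100               # hypothetical largest n inspected


def definition_holds(seq: Callable[[int], float], limit: float) -> bool:
    """Bounded version of: for every eps there exists N such that
    for every n >= N the distance |a_n - a| is below eps."""
    return all(
        any(
            all(eps > abs(seq(n) - limit) for n in range(N, HORIZON + 1))
            for N in range(1, N_MAX + 1)
        )
        for eps in EPSILONS
    )


def negation_holds(seq: Callable[[int], float], limit: float) -> bool:
    """Bounded version of: there exists eps such that for every N
    there exists n >= N with |a_n - a| >= eps."""
    return any(
        all(
            any(abs(seq(n) - limit) >= eps for n in range(N, HORIZON + 1))
            for N in range(1, N_MAX + 1)
        )
        for eps in EPSILONS
    )


def reciprocal(n: int) -> float:
    return 1.0 / n


def alternating(n: int) -> float:
    return (-1.0) ** n


print(definition_holds(reciprocal, 0.0), negation_holds(reciprocal, 0.0))
print(definition_holds(alternating, 0.0), negation_holds(alternating, 0.0))
```

<p>Each <code>all</code> in the first function turns into an <code>any</code> in the second and vice versa, and the innermost inequality is reversed. Because <code>n</code> is bound by its own quantifier in both functions, the question &#8220;what is n?&#8221; never arises.</p>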
]]></content:encoded>
			<wfw:commentRss>http://gowers.wordpress.com/2011/10/02/basic-logic-relationships-between-statements-negation/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/24ee673de88d3b72ddf2772a8e49008d?s=96&#38;d=identicon" medium="image">
			<media:title type="html">gowers</media:title>
		</media:content>
	</item>
	</channel>
</rss>
