A couple of months ago, I can’t remember precisely how, I became aware of a book called How I Wish I’d Taught Maths, by Craig Barton, that seemed to be highly thought of. The basic idea was that Craig Barton is an experienced, and by the sound of things very good, maths teacher who used to take a number of aspects of teaching for granted, until he looked into the mathematics-education literature and came to realize that many of his cherished beliefs were completely wrong. Since I’ve always been interested in the question of how best to teach mathematics, both because of my own university teaching and because from time to time I like to pontificate about school-level teaching, I decided to order the book. More surprisingly, given my past history of buying books that I felt I ought to read, I read it from cover to cover, all 450 pages of it.

As it happens, the book is ideally designed for people who *don’t* necessarily want to read it from cover to cover, because it is arranged as follows. At the top level it is divided into chapters. Each chapter starts with a small introduction and thereafter is divided into sections. And each section has precisely the same organization: it is divided into subsections entitled, “What I used to believe”, “Sources of inspiration”, “My takeaway”, and “What I do now”. These are reasonably self-explanatory, but just to spell it out, the first subsection sets out a plausible belief that Craig Barton used to have about good teaching practice, often ending with a rhetorical question such as “What could possibly be wrong with that?”, the second is a list of references (none of which I have yet followed up, but some of them look very interesting), the third is a discussion of what he learned from the references, and the last one is about how he put that into practice. Also, each chapter ends with a short subsection entitled “If I only remember three things …”, where he gives three sentences that sum up what he thinks is most important in the chapter.

One question I had in the back of my mind when reading the book was whether any of it applied to teaching at university level. I’m still not sure what I think about that. There is a reason to think not, because the focus of the book is very much on school-level teaching, and many of the challenges that arise do not have obvious analogues at university level. For example, he mentioned (on page 235) the following fascinating experiment, where people were asked to do the following multiple-choice question and then justify their answers.

*Which of these values could not represent a probability?*

A. 2/3

B. 0.72315

C. 1.46

D. 0.002

Let me quote the book itself for a discussion of this question.

Surely the rule *probabilities must be less than or equal to 1* is about as straightforward as it gets in maths? But why, then, did 47% of the 5000+ students who answered this question get it wrong?

A few students’ explanations reveal all:

I think B because it’s just a massive decimal and the rest look pretty legit. I also don’t see how a number that big could be correct.

I think B because you wouldn’t see this in probability questions.

I think D because you can’t have 0.002 as an answer because it is too low.

If students are only used to meeting ‘nice-looking’ probabilities during examples and practice questions, then it is little surprise they come a cropper when they encounter strange-looking answers.

Could one devise a university-level question that would catch a significant proportion of people out in a similar way? I’m not sure, but here’s an attempt.

*Which of the following is not a vector space with the obvious notions of addition and scalar multiplication?*

A. The set of all complex numbers.

B. The set of all functions from to that are twice differentiable.

C. The set of all polynomials in with real coefficients that have as a factor.

D. The set of all triples of integers.

E. The set of all sequences such that and .

I think at Cambridge almost everyone would get this question right (though I’d love to do the experiment). But Cambridge mathematics undergraduates have been selected specifically to study mathematics. Perhaps at a US university, before people have chosen their majors, people might be tempted to choose another option (such as B, because vector spaces are to do with algebra and not calculus), while not noting that the obvious scalars in D do not form a field. Or perhaps they wouldn’t like A because the scalar field is the same as the set of vectors (unless, that is, they thought that the obvious scalars were the real numbers).
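To make the point about option D concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the scalars are taken from the rationals (any field containing the integers would do); the helper names are hypothetical. The triple (1, 0, 0) is sent outside the set by the scalar 1/2, so the set is not closed under scalar multiplication.

```python
from fractions import Fraction

def is_integer_triple(v):
    """Return True if v is a triple whose entries are all whole numbers."""
    return len(v) == 3 and all(Fraction(x).denominator == 1 for x in v)

def scale(c, v):
    """Multiply every entry of the triple v by the scalar c."""
    return tuple(c * x for x in v)

v = (1, 0, 0)
w = (2, -3, 5)
half = Fraction(1, 2)  # assumed scalar: an element of the rationals

# One sample sum stays inside the set (a spot check, not a proof of closure).
sample_sum_is_integer_triple = is_integer_triple(tuple(a + b for a, b in zip(v, w)))

# Scaling by 1/2 leaves the set, so D fails the vector space axioms over this field.
scaled_is_integer_triple = is_integer_triple(scale(half, v))
```

If one insisted on the integers themselves as the scalars, the closure check would pass, but the integers are not a field, which is the objection raised above.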

More generally, I feel that there are certain kinds of mistakes that are commonly made at school level that are much less common at university level simply because those who survive long enough to reach that stage have been trained not to make them. For example, at university level we become used to formal definitions. Once one is in the habit of using these, deciding whether a structure is a vector space is simply a question of seeing whether the definition of a vector space applies, rather than thinking “Hmm, does that look like the vector spaces I’ve met up to now?” We also become part of a culture where it is common to look at pathological, or at least slightly surprising, examples of concepts, and so on.

Another reason I decided to read the book was that I have certain prejudices about the teaching of mathematics at school level and I was interested to know whether they would be reinforced by the book or challenged by it. This was a win-win situation, since it is always nice to have one’s prejudices confirmed, but also rather exhilarating to find out that something that seems obviously correct is in fact wrong.

A prejudice that was strongly confirmed was the value of mathematical fluency. Barton says, and I agree with him (and suggested something like it in my book Mathematics, A Very Short Introduction) that it is often a good idea to teach fluency first and understanding later. More precisely, in order to decide whether it is a good idea, one should assess (i) how difficult it is to give an explanation of why some procedure works and (ii) how difficult it is to learn how to apply the procedure without understanding why it works.

For instance, suppose you want to teach multiplication of negative numbers. The rule “If they have the same sign then the answer is positive, and if they have different signs then the answer is negative” is a short and straightforward rule, but explaining why -2 times -3 should equal 6 is not very straightforward. So if one begins with the explanation, there is a big risk of conveying the idea that multiplication of negative numbers is a difficult, complicated topic, whereas if one gives plenty of practice in applying the simple rules, then one gives one’s students fluency in an operation that comes up in many other contexts (such as, for instance, multiplying by ), and one can try to justify the rule later, when they are comfortable with the rule itself. I remember enjoying the challenge of thinking about why the rule for dividing one fraction by another was correct, but that was long after I was happy with using the rule itself. I don’t remember being bothered by the lack of justification up to that point.

As an example in the other direction, Barton gives that of solving linear equations. The danger here is that one can learn a procedure for solving equations such as , get good at it, and then be completely stuck when faced with an equation such as . Here a bit of understanding can greatly help. Barton advocates something called the balance method, where one imagines both sides of the equation on a balance, and one is required to make sure that balance is maintained the whole time. I think (but without too much confidence after reading this book) that I would go for something roughly equivalent, but not quite the same, which is to stress the rule *you can do the same thing to both sides of an equation* (worrying about things like squaring both sides or multiplying by zero later). Then the problem of solving linear equations would be reduced to a kind of puzzle: what can we do to both sides of this equation to make the whole thing look simpler?

That last question is related to another fascinating nugget that is mentioned in the book. Barton gives an example of a question concerning a parallelogram ABCD, where the angle at A is 105 degrees. The line BC is extended to a point E, which is then joined by an additional line segment to D, and the angle CED is 30 degrees. The question is to prove that the triangle CED is isosceles.

Apparently, this question is found hard, because one cannot achieve the goal in one step. Instead, one must observe that the angle of the parallelogram at C is also 105 degrees, from which it follows that the angle ECD is 75 degrees. And from that it follows that the angle EDC is 75 degrees as well, and the problem is solved.

But the interesting thing is that if you change the question to the more open-ended question, “Fill in as many angles in this diagram as you can,” then many people who found the goal-oriented version too hard have no difficulty in filling out all the angles in the diagram and therefore noticing that the triangle CED is isosceles.
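The open-ended version amounts to working forwards from the given facts. A small Python sketch of that forward pass follows; the variable names are my own, and the only inputs are the two angles stated in the question.

```python
# Given facts from the question (all angles in degrees).
angle_DAB = 105  # angle of the parallelogram at A
angle_CED = 30   # angle at E in triangle CED

angles = {"DAB": angle_DAB, "CED": angle_CED}

# Opposite angles of a parallelogram are equal.
angles["BCD"] = angles["DAB"]
# Adjacent angles of a parallelogram add up to 180.
angles["ABC"] = 180 - angles["DAB"]
angles["CDA"] = 180 - angles["DAB"]
# B, C and E lie on a straight line, so the angles at C add up to 180.
angles["ECD"] = 180 - angles["BCD"]
# The angles of triangle CED add up to 180.
angles["EDC"] = 180 - angles["CED"] - angles["ECD"]

# Once every angle is filled in, the isosceles property can simply be read off.
is_isosceles = angles["ECD"] == angles["EDC"]
```

No step here needs to know the goal in advance: each line uses one standard fact, and the equality of the two 75-degree angles appears at the end.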

The lesson I would draw from this with the equations question is that instead of asking for a solution to the equation , it might be better to ask “See whether you can make the equation look simpler by doing something to both sides. If you manage, see if you can then make it even simpler. Keep going until you have made it as simple as you can.” This would of course come after they had already seen several examples of the kind of thing one can do to both sides of an equation.

Barton isn’t content with just telling the reader that certain methods of teaching are better than others: he also tells us the theory behind them. Of particular importance, he claims, is the fact that we cannot hold very much in our short-term memory. This was music to my ears, as it has long been a belief of mine that the limited capacity of our short-term memory is a hugely important part of the answer to the question of why mathematics looks as it does, by which I mean why, out of all the well-formed mathematical statements one could produce, the ones we find interesting are those particular ones. I have even written about this (in an article entitled Mathematics, Memory and Mental Arithmetic, which unfortunately appeared in a book and is not available online, but I might try to do something about that at some point).

This basic point informs a lot of the discussion in the book. Consider, for example, a question that asked you to find the perimeter of a rectangle that had side lengths 2/3 and 3/5. This could be a great question, but it is very important to ask it at the right point in the students’ development. If you ask it before they are fluent at adding fractions and at working out perimeters of rectangles, then the amount they have to hold in their heads may well exceed their cognitive capacity: they need to store the fact that you have to add the two lengths, and multiply by 2, and put both fractions over a common denominator. It is to avoid this kind of strain that attaining fluency is so important: it literally makes it easier to think, and in particular to solve the kind of interesting problems we would all like them to be able to solve. Barton absolutely doesn’t dispute the value of interesting problems that mix different parts of mathematics — he just argues, very convincingly, that one has to be careful when to introduce them.
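To see how many separate things the rectangle question asks a student to juggle, here is a short Python sketch that spells out each sub-step explicitly. The decomposition into three sub-skills is my own reading of the paragraph above, not something taken from the book.

```python
from fractions import Fraction
from math import gcd

a = Fraction(2, 3)  # one side length
b = Fraction(3, 5)  # the other side length

# Sub-skill 1: find a common denominator and rewrite both fractions over it.
common = a.denominator * b.denominator // gcd(a.denominator, b.denominator)
a_num = a.numerator * (common // a.denominator)
b_num = b.numerator * (common // b.denominator)

# Sub-skill 2: add the rewritten fractions.
side_sum = Fraction(a_num + b_num, common)

# Sub-skill 3: remember that the perimeter is twice the sum of the two sides.
perimeter = 2 * side_sum

print(f"{a} + {b} = {a_num}/{common} + {b_num}/{common} = {side_sum}")
print(f"perimeter = 2 * {side_sum} = {perimeter}")
```

A student who is fluent in the first two sub-skills only has to hold the third in mind, which is the point about fluency freeing up capacity.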

An idea he discusses a lot, and that I think might perhaps have a role to play in university-level teaching, is what he calls diagnostic questions, and in particular low-stakes diagnostic tests. These typically take the form of a short multiple-choice quiz, and he tries very hard to create a classroom culture where people understand that the purpose of the quiz is not assessment — the quizzes do not “count” for anything — but a tool to help learning, and in particular to help diagnose problems with understanding.

What makes these questions “diagnostic” is that they are carefully designed in such a way that if you have a certain misconception, then you will be drawn towards a certain wrong answer. That is, the wrong answers people give are informative for the teacher, rather than merely wrong. Here, for example, is a question that fails to be diagnostic followed by a modified version that succeeds.

*A triangle has one side of length 6 and two sides of length 5. What is its area?*

A. 8

B. 11

C. 12

D. 15

E. 20

*A triangle has one side of length 6 and two sides of length 5. What is its area?*

A. 6

B. 12

C. 15

D. 16

E. 24

F. 30

With the second set of choices, each answer has a potential route that one can imagine somebody taking. To obtain the answer 6, one chops the triangle into two right-angled triangles, each of height 4 and base 3, calculates the area of one of them, and forgets to double it. The correct answer is 12. To obtain 15, one takes the formula “half the height times the base” but substitutes in 5 for the height. To obtain 16 one calculates the perimeter. To obtain the answer 24 one takes the height times the base. And to obtain the answer 30 one multiplies the two numbers 6 and 5 together (on the grounds that “to calculate the area you multiply the two numbers together”). Thus, wrong answers yield useful information. With the first set of answers, that just isn’t the case — they are much more likely to be the result of pure guesswork.
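The routes just described can be written down as a small lookup table. The following Python sketch recomputes each choice from its supposed line of reasoning; the `routes` table and the `diagnose` helper are hypothetical names used only for illustration.

```python
import math

base, leg = 6, 5                             # side lengths from the question
half_base = base // 2                        # 3
height = math.isqrt(leg**2 - half_base**2)   # 4, by Pythagoras (a 3-4-5 triangle)

# Each answer choice, computed from the reasoning that would produce it.
routes = {
    half_base * height // 2: "found the area of one right-angled half and forgot to double it",
    base * height // 2: "correct",
    base * leg // 2: "used the side of length 5 as the height",
    base + 2 * leg: "calculated the perimeter instead of the area",
    base * height: "took height times base without halving",
    base * leg: "multiplied the two numbers in the question together",
}

def diagnose(answer):
    """Return the likely reasoning behind an answer, if the table models one."""
    return routes.get(answer, "no modelled route; possibly a guess")

for answer in sorted(routes):
    print(answer, "->", diagnose(answer))
```

Answers from the first set, such as 8, 11 or 20, have no entry in the table, which is one way of seeing why that version tells the teacher so little.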

It’s worth mentioning that Terence Tao has created a number of multiple-choice quizzes on university-level topics. He has also blogged about it here. They are not exactly diagnostic in the sense Barton is talking about, but one could imagine trying to make them so.

Barton uses these diagnostic tests to get a much clearer picture of what his class already understands, before he launches into the discussion of some new topic, than he would by simply asking questions to the class and getting answers from a few keen students. If he diagnoses a fairly serious collective misunderstanding, then he will spend time dealing with that, rather than pointlessly trying to build on shaky foundations.

I’m jumping around a bit here, but a semi-counterintuitive idea that he advocates, which is apparently backed up by serious research, is what he calls pretesting. This means testing people on material that they have not yet been taught. As long as this is done carefully, so that it doesn’t put students off completely, this turns out to be very valuable, because it prepares the brain to be receptive to the idea that will help to solve that pesky problem. And indeed, after a moment of getting used to the idea, I found it not counterintuitive at all. In fact, it resonates very strongly with my experience as a research mathematician: I find reading other people’s papers very difficult as a rule, but if they can help me solve a problem I’m working on, a lot of that difficulty seems to melt away, because I know exactly what I want, and am looking out for the key idea that will give it to me.

There’s a great section on the use of artificial “real-world” problems. I think he would agree with me about Use of Maths A-level. As someone he quotes says, “Students are constantly on their guard against being conned into being interested.” An example he discusses is

*Alan drinks 5/8 of a pint of beer. What fraction of his drink is left?*

If the entire point of the exercise is to gain fluency with subtracting fractions, then he advocates just cutting the crap and asking them to calculate 1-5/8, which I agree with 100%.

If, on the other hand, it is intended as an exercise in stripping away the unnecessary real-world stuff and getting at the underlying mathematics, then he has interesting things to say (later in the book than this section) about the relationship between what he calls the surface structure and the deep structure. The former is to do with the elements of the question that present themselves directly to the student — in this case Alan and the beer — while the deep structure is more like the underlying mathematical question. To train people to uncover the deep structure, it is very important to give them pairs of questions with the same surface structure and different deep structures, and vice versa. Otherwise, they may learn a procedure that works for lots of similar examples and lets them down as soon as a new example comes along with a different deep structure.

There is lots more in the book — obviously, given its length — but I hope this conveys some of its flavour. The only negative thing I can think of to say is that the word “flipping” is overused — the sentence “Teaching is flipping hard” occurs several times, when once would be enough for one book. But if you’re ready for a bit of jocularity of that kind, then I recommend it, as I found it highly thought provoking. I don’t yet know what the result of that provocation will be, but I’m pretty sure there will be one.

December 22, 2018 at 4:20 pm |

[This comment may show up twice by accident – WordPress did something odd]

On how to get students to think about a problem – for example, “fill in as many angles as you can” rather than “show that the triangle is isosceles” – is there any parallel with the ways in which good theorem-proving programs work? The program you prepared with Mohan Ganesalingam a few years ago was I think designed to think like a human mathematician. Obviously that was working at a much more sophisticated level, but did it use comparably indirect approaches along the lines of “Never mind the problem set, let’s work out everything we can about this diagram/series/topological space and then see where that takes us”?

On using wrong answers to multiple-choice questions to diagnose errors and provide appropriate help, something very similar was in vogue about 40 years ago under the name of programmed learning. There were books like that: “What is the answer? If A, go to page 32 [which would take you to the next problem, if A was the right answer]. If B, go to page 137 [which would explain the mistake you had made, set you another problem on the same point, then if you got that one right send you to page 32 to rejoin the main stream]. If C, go to page 48 [where your mistake would be explained, etc.]” There were also machines with the pages on rolls of film and keyboards to select your answers (the film would scroll back and forth in response). Now I think we would achieve the same effect with less clanking and whirring.

December 22, 2018 at 5:34 pm

There is indeed a parallel, and it’s one of the reasons I found that particular passage so interesting. There is a distinction between forwards reasoning, where you explore the consequences of what you know, and backwards reasoning, where you use what you know to reduce what you are trying to prove to something easier. For what Mohan and I were doing, backwards reasoning tended to be the mode of choice, because in practice that tends to lead to less search. But that isn’t always the case: sometimes there are lots of potential reductions of what you are trying to prove, without any of them standing out as being particularly good, while in the other direction there is the possibility of making an “observation” that, once made, is clearly helpful.

I’m amused by what you say about those books. I read one myself, which I found in my school library, about how a radio works. I remember showing it to my father (this would have been in about 1978, so indeed 40 years ago), whose reaction was “That’s a good account of how radio worked about thirty years ago.” Even if it was out of date, it conveyed to me the basic idea of amplitude modulation, which has stuck with me ever since.

December 23, 2018 at 10:51 am |

It would be an interesting experiment to ask that vector space question at some point during the IB Linear Algebra course. The results might well be surprising (or not). (Craig Barton suggests some approaches to how to ask such a diagnostic question; it may be a little harder in a lecture hall, though.)

Eric Mazur had a similar experience when teaching Physics at Harvard; he gave a lecture about it, which is available here: https://youtu.be/WwslBPj8GgI

Another couple of talks which might be of interest: Sandra Laursen on a research project on the impact of IBL (Inquiry Based Learning): https://youtu.be/nEBkk1QfA0k and Michael Starbird asking what we want our students to get out of our math(s) courses: https://youtu.be/VVSaNNrkeEM

Best wishes

December 24, 2018 at 5:45 am |

Thank you for sharing this, Professor Gowers! Much of this blog post resonates with me.

I had come to more or less the same conclusion regarding short-term memory based on my undergraduate education. While tutoring my friends in the Calculus and Linear Algebra courses, I found that the most common problem they faced was being able to hold many ideas in their mind at the same time. I believe that the large number of problems that I had to solve throughout my schooling helped develop my fluency in basic manipulations. I like to think that it frees the memory by pushing those skills to an instinctive level.

Regarding diagnostic questions, I have felt that some of the entrance exams in India for undergraduates and graduates do design their papers in such a manner. The purpose may only be to weed away the less competent students rather than to improve their education, but I’m sure it is possible to design such test papers for use in the university classrooms. I cannot say how much work it will involve on the part of the instructors, though. . .

Luckily, Barton’s book is available in my country at a reasonable price, and I will be buying it at the earliest.

December 26, 2018 at 4:19 pm |

This part is fascinating to me: “Of particular importance, he claims, is the fact that we cannot hold very much in our short-term memory. This was music to my ears, as it has long been a belief of mine that the limited capacity of our short-term memory is a hugely important part of the answer to the question of why mathematics looks as it does, by which I mean why, out of all the well-formed mathematical statements one could produce, the ones we find interesting are those particular ones. I have even written about this (in an article entitled Mathematics, Memory and Mental Arithmetic, which unfortunately appeared in a book and is not available online, but I might try to do something about that at some point). ”

About a month ago, I wrote an email to my son’s physics teacher, claiming that his school should offer a course, or at least a few lectures, on “how to think” — the example I gave was “how to think about things that don’t quite fit in your brain” and the specific example I gave was a geometry proof where things got much easier if you just started writing down all the facts that were obvious.

Not claiming my thoughts were original (clearly not) but I just bought Barton’s book to see what else is in there, and would love to see your article if you can somehow make it available.

Thanks for a fascinating weblog.

December 26, 2018 at 5:07 pm |

On the question of diagnostic quizzes, how does the teacher predict the reasoning the student may have used to arrive at their choice by looking at the student’s choice? Is there only one way of getting to that choice?

A bit on a tangent, how do you go about picking the wrong(?) choices for such a diagnostic question?

December 26, 2018 at 5:46 pm

The point is to not worry about the reasoning the student used. The probability that a student can score well on a well designed diagnostic test using mostly faulty reasoning is extremely low. Well designed means that you give questions where the obvious guess won’t work.

Designing a good diagnostic test for a particular course is not easy. It requires someone who has not just a lot of experience teaching the course but has already devoted a lot of attention to diagnosing why students taking that course do poorly. Without firsthand experience, it is *impossible* to know what to look for. I say this because I went through this myself. I taught university level math in the same naive way that Barton did for many years. Then my department hired an instructor (Jerry Epstein, now passed away) who knew the math education research about why so many students and adults fall short in math and who had devoted a lot of effort to designing and validating diagnostic tests. The questions were *below* high school level, but even at top US colleges, as many as 10% of the students (but not necessarily math majors) did poorly. He explained to us how this all happened. We of course started administering the test to our students, which confirmed everything he said. After this, we redesigned our diagnostic test (which included a careful validation process) and tried our best to change how we taught our precalculus and calculus courses.

I also recommend looking at the first chapter of Calculus by Hughes-Hallett which has excellent precalculus problems, which I feel are good questions testing a student’s readiness to take calculus. Here is one of my favorite ones:

Find the exact value of arccos (cos 4).

If you view this as a trick question, you are missing the point. Students should be learning math as a meticulous logically rigorous process and to solve this problem in that fashion. If they answer “4”, they have not learned how to do math correctly.

December 26, 2018 at 7:26 pm

I tried to give some indication of this in the post. The idea is to predict (using one’s experience of teaching) the incorrect reasoning that people are likely to make, and to give the answers that that incorrect reasoning yields. To give a very simple example, if you asked them to subtract 28 from 80, then a good choice of wrong answer would be 62, which one obtains by noting that it ends with a 2 and that 8-2=6. (So actually, contrary to what Deane Yang writes, the point is very much to worry about the reasoning that the students use.)

I don’t think designing good diagnostic questions and answers is all that easy. But Barton (I think) has created a website devoted to such questions, so teachers don’t all have to do it for themselves. Unfortunately you have to register to look at the questions, but at least it is free.

December 26, 2018 at 6:09 pm |

As for “Alan drinks 5/8 of a pint of beer. What fraction of his drink is left?”, here are some of my thoughts:

First, we use such questions as diagnostics (in addition to straightforward arithmetic and algebraic computation problems) to see whether a student has been properly trained or not in his earlier math classes. If they do poorly on problems which require properly interpreting the meaning of the text, then they are not ready for university level math.

Second, I don’t see how using “pairs” is at all sufficient. Certainly, the first problems given to a student should be straightforward. However, after that, almost *every* problem should be designed so that it cannot be answered using a superficial reading. What is true is that every so often you should throw in a problem where the obvious answer is the correct one. The latter is mostly for building a student’s confidence that they really can understand correctly what is being asked. If a student successfully learns how to find the deep structure of a problem, then a problem where the superficial and deep structures agree and the answer is obvious can be quite unsettling to the student.

We do not want to teach a student only to distinguish superficial and deep structure. This feeds the view that math problems are “trick questions”. We want every student to immediately step around the superficial structure and look *only* for the deep structure. A question is not a trick question if it can be solved by a straightforward rigorous reading of the question. The goal is to make them as fluent in doing this as it is for us (which is to say that it will never become easy). Only then should we say that a student knows how to do math.

December 26, 2018 at 7:31 pm

On your last point, Barton has a collection of questions he calls SSDD (same surface structure, different deep structure). They aren’t trick questions at all — just questions that can’t be done on autopilot because one has to do exactly what you advocate — that is, get behind the surface structure to see what the deep structure is.

For example, instead of giving lots of questions where one has to work out one of the sides of a right-angled triangle given the other two sides, one could have questions where you have to work out the area, the altitude when the hypotenuse is the base, the perimeter, and so on, so that students don’t go into “It’s a right-angled triangle, so I must use Pythagoras” mode.

December 28, 2018 at 8:02 pm |

FWIW, at most US universities, I think that most students who haven’t chosen a major yet won’t know what a “vector space” or a “field” is.

December 29, 2018 at 1:23 pm

Seconded, and you can also insert “Canadian”.

Also, at a reasonable number of UK universities, it isn’t until the 2nd year or later (of the “major”) that students learn what a field is.

January 4, 2019 at 2:02 am |

Something of a different point of view, but related. https://andrewgelman.com/2018/07/27/makes-robin-pemantles-bag-tricks-teaching-math-great/

January 11, 2019 at 1:03 am |