I have now closed the polls in the second mathematical writing experiment. Here are the results. I have also published the comments on the first and second experiments, which shed further light on the results and on (some) people’s reasons for voting the way they did.
The results were complicated slightly by the fact that after a couple of days the post was linked to from the front page of Hacker News, and suddenly the number of people who voted more than tripled in a few hours. Furthermore, there was a bit of discussion about the polls, so it is not clear that the votes were completely independent. Also, the profile of an average Hacker News reader is probably somewhat different from the profile of an average reader of this blog.
Fortunately, I had recorded the votes shortly before this happened, so below we (“we” means Mohan Ganesalingam and I) present both sets of results. As it happens, the proportions didn’t change too much. We begin with some bar charts. U stands for “undergraduate”, G for “graduate”, C for “computer” and D for “don’t know”. The portion coloured in blue represents people who claimed to be sure that they were correct, and the portion coloured in red represents those who were unsure.
At the end of the post we give the exact numbers.
A brief remark before we present the results is that none of the three “contestants” was explicitly trying to write proofs that would appear as human as possible. The two humans were asked to provide proofs and not told why. And the program was designed to produce passable write-ups, but not to fool people into thinking that those write-ups were written by a human. (There are some easy improvements that could have been made if we had intended to do that, but we did not originally envisage carrying out this second experiment.)
The general picture can be summarized as follows.
1. The computer was typically identified by around 50% of all those who voted.
2. Typically around half of those were confident that they were correct, and half not so confident.
3. A non-negligible percentage of respondents claimed to be sure that a write-up that was not by the computer was by the computer.
The results in bar-chart form
There is quite a lot to say about these results, and also about the program. Most of the rest of this post is written jointly with my collaborator Mohan Ganesalingam (as indeed were the posts with the two experiments — we chose the wording very carefully), but first let me say a little about who Mohan is.