Here are some two-dimensional scatter plots. It does not matter what the x and y axes represent – they could be anything.


At least one of these plots is showing randomly generated data. How many plots were randomly generated, and how many show real trends (such as a curve or clustering)?

  1. Only one is randomly generated, and three show real trends.
  2. Two are randomly generated, and two show real trends.
  3. Three are randomly generated, and one shows a real trend.
  4. All four of the plots are randomly generated.



This is a tricky one, so I will help you out.

What if I add a splash of color to the top-left plot? Can you see the clustering now?


What about a trendline through the bottom-right plot? Looks like a pretty good fit.


So, the answer is option 2, right? Two plots are randomly generated, and two show real trends?


The answer is option 4. All four of the plots were randomly generated.

But only 6 percent of respondents answered correctly.

Does that mean that more than 90 percent are seeing phantom patterns?

Yes … and no.

It is true that we are hardwired to see patterns in data and are, therefore, really bad at identifying randomness. In this case, though, the two patterns shown above (the clustering and the parabola) could easily have fooled a computer's statistical tests as well. The clustering is strong, and the goodness-of-fit of the parabola is high. Unfortunately, the trends were just coincidences.
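To get a feel for how easily pure noise can produce a convincing fit, here is a minimal Python sketch. It is not how the plots above were made. The number of points per plot, the uniform random coordinates, the number of plots, and the helper names `quadratic_r_squared` and `best_random_fit` are all assumptions chosen for illustration. The sketch fits a parabola to many random scatter plots and compares the best goodness-of-fit (R²) with the average one.

```python
import random


def quadratic_r_squared(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2; returns the R^2 of the fit."""
    n = len(xs)
    # Normal equations (X^T X) beta = X^T y, where each row of X is [1, x, x^2].
    s = [sum(x ** k for x in xs) for k in range(5)]              # power sums of x
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]

    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))

    mean_y = sum(ys) / n
    ss_res = sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def best_random_fit(n_plots=1000, n_points=10, seed=0):
    """Fit a parabola to many purely random plots; return (best R^2, mean R^2)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    scores = []
    for _ in range(n_plots):
        xs = [rng.random() for _ in range(n_points)]
        ys = [rng.random() for _ in range(n_points)]  # y is independent of x
        scores.append(quadratic_r_squared(xs, ys))
    return max(scores), sum(scores) / len(scores)


best, mean = best_random_fit()
print(f"best R^2 over random plots: {best:.2f}, average: {mean:.2f}")
```

With only a handful of points per plot, the best of many random plots will typically score far higher than the average one, even though x and y are unrelated in every single plot. Pick that one plot out of the pile, draw the curve through it, and it looks like a real trend.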

So, once again: Do not feel too bad!