Here are some two-dimensional scatter plots. It does not matter what the x and y axes represent – they could be anything.
At least one of these plots is showing randomly generated data. How many plots were randomly generated, and how many show real trends (such as a curve or clustering)?
(a) Only one is randomly generated, and three show real trends.
(b) Two are randomly generated, and two show real trends.
(c) Three are randomly generated, and one shows a real trend.
(d) All four of the plots are randomly generated.
This is a tricky one, so I will help you out.
What if I add a splash of color to the top left plot? Can you see the clustering now?
What about a trendline through the bottom-right plot? Looks like a pretty good fit.
So, the answer is (b), right? Two plots are randomly generated, and two are real trends?
The answer is (d). All four of the plots were randomly generated.
Given that last edition’s cognitive biases theme was the illusion of causality, you really should have seen it coming! But only 12 percent of respondents answered correctly.
Does that mean that almost 90 percent of respondents are seeing phantom patterns?
Yes … and no.
It is true that we are hardwired to see patterns in data and are, therefore, really bad at identifying randomness. In this case, though, the two patterns shown above (the clustering and the parabola) could easily have fooled a computer running statistical tests, too. The clustering is strong, and the goodness-of-fit of the parabola is high. Unfortunately, the trends were just coincidences.
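If you want to see how easily pure noise can produce a decent-looking parabola, here is a minimal sketch in Python. It is not how the plots above were made. The uniform random points, the number of points per plot, the number of plots, and the function names (`quadratic_r_squared`, `best_fit_among_random_plots`) are all assumptions chosen for illustration. The idea is simply to generate many random scatter plots, fit a parabola to each, and keep the best goodness-of-fit (R²) found.

```python
import random


def quadratic_r_squared(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2; returns the R^2 of the fit."""
    n = len(xs)
    # Power sums that make up the normal equations (s[0] equals n).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix: row i is [s[i], s[i+1], s[i+2] | t[i]].
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution for the coefficients [a, b, c].
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - known) / m[i][i]
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum(
        (y - (coef[0] + coef[1] * x + coef[2] * x * x)) ** 2
        for x, y in zip(xs, ys)
    )
    return 1.0 - ss_residual / ss_total


def best_fit_among_random_plots(n_plots, n_points, rng):
    """Generate n_plots purely random scatter plots; return the best R^2 seen."""
    best = 0.0
    for _ in range(n_plots):
        # Assumption: x and y are independent uniform noise, so any
        # "trend" the fit finds is a coincidence by construction.
        xs = [rng.random() for _ in range(n_points)]
        ys = [rng.random() for _ in range(n_points)]
        best = max(best, quadratic_r_squared(xs, ys))
    return best


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for n_plots in (1, 10, 100, 1000):
        best = best_fit_among_random_plots(n_plots, n_points=15, rng=rng)
        print(f"best R^2 among {n_plots} random plots: {best:.2f}")
```

The more random plots you look through, the better the best "parabola" tends to get, even though nothing real is there. Picking the most striking plots out of a larger batch works the same way.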
So, once again: Do not feel too bad!