In a Nutshell:

In this third part of our Cognitive Bias series, Michael Smith introduces the illusion of causality bias and how it relates to Aaron Ramsey and celebrity deaths as well as to pipeline integrity management.

In Cognitive Biases #2, I introduced my favorite English mathematician, Thomas Bayes (1701-1761). I also talked about one of the most endemic cognitive biases we suffer: the confirmation bias.

Today, I’d like to talk about another bias: the illusion of causality. But before I do that, let’s embark on a long, rambling introduction. I’m not busy.

As it turns out, there’s another English Thomas whom I admire very much – the late Thomas James Whittaker MBE (1898-1956). Far from being a mathematician, Whittaker was a football (soccer!) player and manager. Quite a successful one, in fact: in the post-war period, he led Arsenal Football Club to two league titles and an FA Cup victory.

I bring this up because Thomas Whittaker is my great-great-uncle-in-law. It’s not a claim to fame by any stretch of the imagination, but it’s as good a reason as any to be an Arsenal supporter. At the very least, it explains why I’m so unashamed of my Arsenal mug (despite working in an office full of dedicated Newcastle fans).

Indeed, for a number of years now, I’ve followed Arsenal FC attentively, watching the highs and lows (mostly lows) and seeing some giants of the game come and go (mostly go). Among the most painful departures were Thierry Henry (twice!), Cesc Fabregas, Robin van Persie, Alexis Sanchez and the chosen one, Nicklas Bendtner.

Last summer also saw the exit of long-serving Welsh midfielder Aaron Ramsey. Ramsey spent eleven seasons with the club, and they were not without incident. After suffering a horrendous broken leg at the age of 19, Ramsey fought his way back into the first team and made over 300 appearances for Arsenal over the next decade. During this time, he scored 64 goals, one of which ended the club’s infamous 3,283-day trophy drought with an FA Cup win. Well done, Aaron.

Alright, alright.

Well, did you know that around a third of Aaron Ramsey’s Arsenal goals were followed closely by a high-profile celebrity death?

Victims of the “Curse of Aaron Ramsey” included Steve Jobs, Whitney Houston, Robin Williams, David Bowie, Alan Rickman, Nancy Reagan, Sir Roger Moore and Professor Stephen Hawking.

When quizzed about it, Ramsey described the curse as a “ridiculous rumor,” but many were unconvinced.

Believe it or not, spooky patterns such as this are commonplace. Over the past 20 years, for instance, the number of Nicolas Cage films has correlated with the number of people who drowned by falling into a pool, while the age of Miss America has correlated eerily well with the number of murders by steam, hot vapors and hot objects. And there are many more examples (not all of which involve death).

What’s happening here? Can those who died by becoming tangled in their bedsheets really blame their demise on per capita cheese consumption?

No! So, before you grab your rabbit’s foot, start bubble-wrapping your mirrors or – heaven forbid – knock on wood (you know who you are), it’s worth remembering that correlation is not causation.

I repeat, correlation is not causation.

In fact, to avoid the wrath of any statistically minded readers, I’ll rephrase this slightly. Correlation is not necessarily causation.

Let’s imagine that we have two events, A and B, and they’re correlated (that means they consistently coincide with one another). In some cases, this correlation occurs because A is indeed the cause of B (left of Figure 1). This is true, for instance, when A is the consumption of contaminated food, and B is its prompt and violent expulsion. Urgh.

However, direct causality is not the only reason for correlation. What if A and B are unrelated, but both are triggered by a third variable, C (center of Figure 1)?

A classic example is the strong correlation observed between murder rate and ice cream sales in large cities. Though unconnected, both increase during the summer months and decrease during the winter months (when all the murderers stay inside to warm their toes by the fire). Variables like C are known as confounders, and – as we’ll see later – they can really muddy the waters in a statistical analysis.
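A quick simulation makes the confounder pattern concrete. The sketch below is illustrative only: the variable names and every number in it are made up, not real crime or sales data. A hypothetical temperature series plays the role of C and drives two otherwise independent quantities, A and B. They correlate strongly overall, but the correlation should all but vanish once C is held roughly constant.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# C: hypothetical daily temperature in degrees Celsius (made-up range).
temperature = [rng.uniform(0, 30) for _ in range(5000)]

# A and B each depend on C plus their own noise, never on each other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 10) for t in temperature]
violent_incidents = [0.5 * t + rng.gauss(0, 3) for t in temperature]

overall_r = pearson(ice_cream_sales, violent_incidents)

# Hold C roughly fixed by keeping only days in a narrow temperature band.
band = [
    (a, b)
    for t, a, b in zip(temperature, ice_cream_sales, violent_incidents)
    if 14 <= t < 16
]
within_band_r = pearson([a for a, _ in band], [b for _, b in band])

print(f"overall correlation:        {overall_r:.2f}")
print(f"within the 14-16 C band:    {within_band_r:.2f}")
```

Restricting to a narrow band is the crudest possible way of "controlling for" C, but it is enough to show that A and B have nothing to say to each other once the weather is taken out of the picture.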

A final possibility, dare I say it, is coincidence (right of Figure 1). Events can coincide for no reason whatsoever. I think we can safely put the “Curse of Aaron Ramsey” in this final category. The jury’s still out on Nicolas Cage.

Figure 1: Correlation vs. causation

Strange though they seem to us, coincidences are natural and abundant. Given the number of events in the world (and hence the number of opportunities for unrelated events to coincide), it would actually be far stranger if coincidences never happened at all.
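To get a feel for how cheap coincidences are, here is a small sketch that assumes nothing more than independent random noise. It generates 200 unrelated series of 10 "annual" values each and counts how many pairs happen to correlate strongly. The series count, the series length and the 0.8 threshold are all arbitrary choices made for illustration.

```python
import random
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable

N_SERIES = 200  # number of unrelated "statistics" (arbitrary)
N_YEARS = 10    # observations per series (arbitrary)

# Every series is pure noise, generated independently of all the others.
series = [[rng.gauss(0, 1) for _ in range(N_YEARS)] for _ in range(N_SERIES)]

total_pairs = 0
strong_pairs = 0
for a, b in combinations(series, 2):
    total_pairs += 1
    if abs(pearson(a, b)) > 0.8:
        strong_pairs += 1

print(f"{strong_pairs} of {total_pairs} unrelated pairs have |r| > 0.8")
```

Any individual pair is unlikely to clear the threshold, but with nearly twenty thousand pairs to choose from, unlikely is still plenty. Go looking for a curse in a big enough pile of data and you will find one.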

Humans, however, have evolved to model the world through causal reasoning. From a Darwinian perspective, it just didn’t pay for our ancestors to write things off as random. They needed a more sophisticated understanding of cause and effect in order to survive.

Unfortunately, this has resulted in an unshakeable tendency to see causal relationships where they simply don’t exist. This illusion of causality is, in fact, one of our most troublesome cognitive biases. It affects politics, medicine, entertainment, science and technology – from sports betting to the stock market, from crash diets to climate change.

Great question.

Clearly, this bias affects the pipeline industry, too. Here’s an example that came to mind recently.

Imagine you’re designing a new pipeline, and you need to select an external coating. It’s a straight fight between asphalt and fusion bonded epoxy (FBE), both of which have been used on your system in the past.

Historically, the greatest threat to your pipeline network has been external corrosion, and you perform regular in-line inspections to detect this threat. It makes sense to look at the accumulated corrosion damage on your pipelines to see if the coating type has made any difference.

Feeling clever, you decide to look at how the probability of exceedance (PoE) varies with the coating type. PoE is essentially a probability-of-failure estimate that considers the depths of corrosion anomalies as measured by in-line inspection. It’s a simple proxy variable for the condition of a pipeline. Figure 2 shows the distribution of PoE values for the pipelines in your network. The box plots show the positions of the minimum, lower quartile, median, upper quartile and maximum values.
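There are several ways to calculate PoE. As a minimal sketch of one simple formulation, assume that the in-line inspection sizing error is normally distributed, that an anomaly "exceeds" when its true depth passes a critical depth (80% of wall thickness here), and that anomalies are independent of one another. These assumptions, the function names and the default values are all illustrative, not a description of any particular operator's method.

```python
import math

def anomaly_poe(measured_depth_pct, critical_depth_pct=80.0, sizing_sd_pct=7.8):
    """Probability that one anomaly's true depth exceeds the critical depth.

    Depths are in percent of wall thickness. Assumes the true depth is
    normally distributed around the measured depth. The default standard
    deviation is an assumed tool specification: roughly what a +/-10%
    tolerance at 80% confidence would imply.
    """
    z = (critical_depth_pct - measured_depth_pct) / (sizing_sd_pct * math.sqrt(2.0))
    return 0.5 * math.erfc(z)  # upper tail of the normal distribution

def pipeline_poe(measured_depths_pct, **kwargs):
    """Probability that at least one anomaly exceeds, assuming independence."""
    prob_none_exceed = 1.0
    for depth in measured_depths_pct:
        prob_none_exceed *= 1.0 - anomaly_poe(depth, **kwargs)
    return 1.0 - prob_none_exceed

# Two hypothetical pipelines, described by their reported anomaly depths.
few_shallow = pipeline_poe([10, 15, 20])
several_deep = pipeline_poe([40, 55, 60, 62, 70])

print(f"few shallow anomalies:  {few_shallow:.2e}")
print(f"several deep anomalies: {several_deep:.2e}")
```

However it is computed, the result is a single number per pipeline that rises as the anomalies get deeper or more numerous, which is what makes it a convenient proxy for condition.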

Figure 2: Probability of exceedance (PoE) vs. coating type

What do you know! The FBE-coated pipelines have much lower PoE values than the asphalt-coated pipelines. That means they tend to have fewer and shallower corrosion anomalies. Clearly, FBE is a superior coating.

You run to your manager, excited at your discovery and ready to insist on a policy that all new pipelines be coated with FBE. But, to your dismay, your manager is unimpressed.

You have, sadly, fallen victim to the illusion of causality. By performing a univariate analysis, you have ignored all of the other variables that affect pipeline condition. Most notably, you have forgotten about age (Figure 3).

Figure 3: Age vs. coating type

Ah.

On average, the asphalt-coated pipelines are around 25 years older than the FBE-coated pipelines. It stands to reason that older pipelines will tend to be in a poorer condition, irrespective of the coating type.

It is, of course, still possible that coating type affects corrosion susceptibility, but clearly the age variable is confounding your judgment. And that’s just one variable. What about the local environment, cathodic protection, proximity to roads and rivers, and historical repairs? The list goes on.
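The trap is easy to reproduce with made-up numbers. In the sketch below, every value is simulated and hypothetical: a "damage score" stands in for PoE, it depends on age alone, and coating has no effect at all by construction. The asphalt lines are simply about 25 years older on average. A naive comparison still blames the coating, whereas a least-squares fit that includes age as a second variable should not.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def simulate(n, mean_age, is_asphalt):
    """Return (age, is_asphalt, damage) rows; damage ignores the coating."""
    rows = []
    for _ in range(n):
        age = max(1.0, rng.gauss(mean_age, 10))   # years, made-up spread
        damage = 0.02 * age + rng.gauss(0, 0.2)   # depends on age only
        rows.append((age, is_asphalt, damage))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# 1000 hypothetical FBE lines (coded 0.0) and 1000 asphalt lines (coded 1.0).
rows = simulate(1000, 20, 0.0) + simulate(1000, 45, 1.0)

# Univariate view: difference in mean damage between the two coatings.
naive_gap = (mean(d for _, c, d in rows if c == 1.0)
             - mean(d for _, c, d in rows if c == 0.0))

# Multivariate view: fit damage = b0 + b_age * age + b_coat * is_asphalt
# by solving the two-predictor normal equations on centered data.
m_age = mean(a for a, _, _ in rows)
m_coat = mean(c for _, c, _ in rows)
m_dmg = mean(d for _, _, d in rows)
s_aa = sum((a - m_age) ** 2 for a, _, _ in rows)
s_cc = sum((c - m_coat) ** 2 for _, c, _ in rows)
s_ac = sum((a - m_age) * (c - m_coat) for a, c, _ in rows)
s_ad = sum((a - m_age) * (d - m_dmg) for a, _, d in rows)
s_cd = sum((c - m_coat) * (d - m_dmg) for _, c, d in rows)
det = s_aa * s_cc - s_ac ** 2
b_age = (s_cc * s_ad - s_ac * s_cd) / det
b_coat = (s_aa * s_cd - s_ac * s_ad) / det

print(f"naive asphalt-minus-FBE gap:    {naive_gap:.3f}")
print(f"coating effect after age:       {b_coat:.3f}")
print(f"age effect (per year):          {b_age:.4f}")
```

Adding age to the model is only a first step. Each of the other variables listed above is a potential confounder too, and a real analysis would need to consider them.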

This is a somewhat simplistic example, but the point is clear. If we misinterpret the cause of a measured effect, we risk making poor decisions. With pipelines, poor decisions have consequences – whether it be high costs, loss of production or even loss of life.

What’s the moral of the story? Do the stats, and beware the illusion of causality.

Just another example of the stupidity of humans.
