*In a Nutshell:*

*Michael Smith of the ROSEN Group introduces the next installment of our cognitive biases series. This time, he focuses on confirmation bias: can you put your beliefs aside and be unbiased?*

For those who haven’t read part #1 of this miniseries (the Bias Blind Spot), please allow me to introduce my pipeline operator friend.

Today he’s going to help me explain the **Confirmation Bias**. But before we do that, let’s take a trip back in time.

Not that far.

Just a few centuries.

You see, as a self-proclaimed data scientist, I draw inspiration from a long line of great English mathematicians.

Firstly, there’s George Boole (1815-1864), creator of Boolean algebra, the lifeblood of digital electronics. Then there’s Charles Babbage (1791-1871), inventor of the first programmable computer, the Analytical Engine. Around the same time, there was Ada Lovelace (1815-1852), who published the world’s first algorithm intended for a computer. And in the 20th century, the great Alan Turing (1912-1954) became the father of computer science when he invented the Turing Machine.

Yes … you’re special too.

All of these great mathematicians paved the way for the digital revolution of the 21st century.

But my favorite English mathematician of all wasn’t thinking about computers. No, **Thomas Bayes** (1701-1761) was thinking about statistics.

OK, not a popular topic. But bear with me. Statisticians are people, too.

Bayes’ famous contribution to statistics was his theorem of conditional probability, an equation that describes how our prior beliefs about the world should be updated as new evidence is observed. Although Bayes wasn’t interested in computer science, the branch of mathematics that we now call **Bayesian Statistics** turns out to be extremely useful in machine learning applications.

So how exactly does it work?

As an example, let’s imagine that I have an icosahedral dice in my pocket.

A 20-sided dice.

If I took the dice out of my pocket right now and rolled it (without you seeing it), what would you say is the probability of rolling the number 20?

OK. For most rational individuals, 1/20 is the first answer that pops into their head. But why? You cannot possibly know. You haven’t even seen the dice.

That answer relies on a great many assumptions.

At the very least, you’ve assumed that the number 20 actually appears on the dice. In fact, you’ve almost certainly imagined that the integer 20 appears exactly once, with each of the integers 1-19 appearing exactly once on 19 other equivalently sized faces.

Furthermore, you’ve asserted that this dice has an equal chance of landing on any side. That means it has a uniform density. You’ve also assumed that I’d throw the dice randomly, with no bias towards any face.

Of course, you didn’t systematically list your assumptions like this. You simply extrapolated from your existing understanding of “dice,” and the answer came readily to mind.

In Bayesian terminology we call this a **prior belief**.

Now, let’s test how strongly you hold that belief.

Imagine that I’ve now rolled the dice once. I didn’t get a 20. You shouldn’t be surprised; your prior belief (1/20) tells you that this is far more likely than not. But suppose that after 50 attempts, I still haven’t rolled a 20. Are you surprised now? What about 100 rolls? 500?

After 500 rolls, it’s clear that something’s amiss. For the dice you imagined, the chance of 500 rolls without a 20 is less than 1 in 100 billion (10^11). The rational conclusion: the dice is not the dice you imagined. You still don’t know what the dice looks like, but you know that the true probability of rolling a 20 is much lower than you first thought. Maybe the 20 face is much smaller than the others? Or maybe there’s no 20 at all?
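The figure quoted above is easy to check. The sketch below assumes the dice you imagined: a 1/20 chance of a 20 on every roll, with each roll independent of the others. The function name is purely illustrative.

```python
def prob_no_twenty(rolls: int, p_twenty: float = 1 / 20) -> float:
    """Chance of seeing no 20 at all in `rolls` independent rolls."""
    return (1.0 - p_twenty) ** rolls


for n in (1, 50, 100, 500):
    print(f"{n:>3} rolls without a 20: probability {prob_no_twenty(n):.3g}")
```

A single miss is the expected outcome (95%). By 50 rolls the chance has dropped below 8%, and by 500 it is below 1 in 100 billion.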

Back to Bayes’ theorem.

When you started, your prior belief may have looked something like this:

Note that this is a distribution, not a point value. That’s because you’re uncertain about the true value. The higher the uncertainty, the wider the distribution. I’ve used a beta distribution here, with 1/20 (0.05) as its mean value.
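To make "wider means more uncertain" concrete, here is a minimal sketch comparing two beta distributions that share the same mean of 0.05. The parameter pairs (1, 19) and (10, 190) are assumptions chosen for illustration; the parameters behind the original chart are not stated.

```python
from math import sqrt


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_std(a: float, b: float) -> float:
    """Standard deviation of a Beta(a, b) distribution."""
    return sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


# Hypothetical priors: same mean, different widths.
for label, a, b in [("wide prior", 1, 19), ("narrow prior", 10, 190)]:
    print(f"{label}: Beta({a}, {b})  mean={beta_mean(a, b):.3f}  std={beta_std(a, b):.3f}")
```

Both priors say "about 1 in 20," but the first spreads its belief over a much broader range of possible values.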

Now, after each new piece of evidence, this distribution will evolve. We can model the evolution using Bayes’ theorem.

This is quite intuitive. We can see that after a single roll, you’ve barely changed your prior belief at all. With additional rolls, however, the distribution starts to move to the left and get narrower. After 500 rolls, you’ve abandoned your prior belief completely.
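For a beta prior and yes/no evidence (a 20 or not a 20), Bayes’ theorem reduces to simple counting: every 20 adds one to the first parameter, and every other result adds one to the second. The sketch below assumes a starting prior of Beta(1, 19), which has the mean of 0.05 mentioned above; the exact starting parameters are an assumption.

```python
def update(a: float, b: float, twenties: int, others: int) -> tuple[float, float]:
    """Posterior Beta parameters after observing the given roll counts."""
    return a + twenties, b + others


def mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


prior = (1.0, 19.0)  # assumed prior: mean 0.05
print(f"before any rolls: mean belief = {mean(*prior):.4f}")

for rolls in (1, 50, 100, 500):
    # Every roll so far has failed to produce a 20.
    posterior = update(*prior, twenties=0, others=rolls)
    print(f"after {rolls:>3} rolls:  mean belief = {mean(*posterior):.4f}")
```

Under these assumptions, the mean belief barely moves after one roll (1/20 to 1/21) but has fallen to 1/520 after 500 rolls.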

The rate at which you modify your belief is dependent on two things: the initial level of uncertainty and the strength of the evidence. If you’re confident in your prior belief, and you’re presented with weak evidence to the contrary, you’ll change your mind very slowly (if at all). If you’re skeptical about your prior belief, and you’re presented with strong evidence to the contrary, you’ll change your mind rapidly.
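A rough sketch of both effects, using the same counting rule for beta priors: two hypothetical observers start with the same mean belief (0.05) but different levels of confidence, and each sees either weak evidence (10 rolls without a 20) or strong evidence (500 rolls without a 20). The prior parameters are assumptions chosen for illustration.

```python
def posterior_mean(a: float, b: float, twenties: int, others: int) -> float:
    """Mean of the Beta posterior after the observed rolls."""
    return (a + twenties) / (a + b + twenties + others)


# Hypothetical observers: same prior mean (0.05), different confidence.
priors = {
    "skeptical, Beta(1, 19)": (1.0, 19.0),
    "confident, Beta(20, 380)": (20.0, 380.0),
}

for rolls in (10, 500):
    print(f"{rolls} rolls, no 20s:")
    for label, (a, b) in priors.items():
        belief = posterior_mean(a, b, twenties=0, others=rolls)
        print(f"  {label}: mean belief = {belief:.4f}")
```

With only 10 rolls, the confident observer has hardly moved, while the skeptical one has already cut their estimate by a third. With 500 rolls, both have shifted substantially, but the confident observer still lags well behind.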

Bayes’ theorem only makes sense if we take all evidence into account and judge its strength objectively. In the aforementioned example, we cannot just ignore the dice rolls. Nor can we dismiss them as invalid or weak evidence without good reason.

That would be madness, right?

Indeed, it would, but this special kind of madness is something we all display on a daily basis when we exhibit the **confirmation bias**. Confirmation bias is our natural tendency to overweight evidence that supports our existing beliefs and underweight (or even ignore) evidence that does not. This helps to regulate our emotions and limit the feelings of anxiety that stem from “cognitive dissonance.”

Great question.

We’re all aware of mounting public pressure against pipelines in many parts of the world. Pressure groups cite the perceived dangers of pipelines to the public and (increasingly) the environment. But, as is so regularly the case in political matters, they often fail to acknowledge the other side of the argument, namely the economic importance of pipeline infrastructure. This is a very real manifestation of confirmation bias.

So how should we respond to the growing opposition to our industry?

Tempting, but maybe not.

The best thing we can do is to do our jobs effectively. In practice, that means establishing the true condition of assets and responding to threats appropriately when we find evidence of their existence.

The problem is that the evidence is not always forthcoming. Pipelines are long, complex and difficult to access, meaning our best option is usually to collect indirect evidence through online monitoring, surveys and in-line inspections (ILI). Occasionally we perform direct examinations, but these are costly, and we cannot excavate everything.

The question is: do we collect and interpret evidence in an unbiased manner? Or do we, like our critics, seek out evidence that fits a pre-existing narrative?

Let’s wrap it up with a thought experiment. Imagine you hold the following beliefs:

*“There is no risk of stress corrosion cracking on my pipeline.”*

*“This damage was caused by microbiologically influenced corrosion.”*

*“Pipeline theft doesn’t happen in this part of the world.”*

How would these beliefs affect your approach to integrity management?

How might your approach change if you actively sought out evidence to contradict your beliefs?

**Next time, we’ll talk about randomness, and why you’re really bad at spotting it.**