Author: Edmund Bennett

Beyond the Pig: Data-Driven Risk Management for Challenging Pipelines

In an era where data is abundant, the pipeline industry faces a unique challenge: how to transform vast, complex datasets into actionable insights. Our expert, Edmund Bennett, explains in more detail how ROSEN is tackling this challenge head-on by using analytics and machine learning techniques to uncover structural patterns and risk factors that can enrich traditional analyses. Used in combination with ROSEN’s deep knowledge of pipeline integrity and threat mechanisms, these techniques offer a new approach to global pipeline benchmarking and allow operators to understand their own expectations for the lifespan of this critical infrastructure.

Why data-driven integrity management matters

Effective pipeline integrity management depends on a wide range of data, including inspection results, geospatial information, material properties, and anomaly reports. This data is essential for identifying, assessing, and mitigating risks, as well as making informed decisions about maintenance and asset performance. 

ROSEN’s Integrity Data Warehouse is a unique and powerful resource for the pipeline industry, containing almost 83 million inspected joints from more than 30,000 evaluated in-line inspections. It provides the vast and comprehensive dataset described above, including inspection data, reports, and geospatial data relevant to pipeline integrity threats, and offers an unparalleled, broad picture of the condition of a significant portion of the world's piggable pipeline assets. 

The complexity and volume of this data can be difficult to interpret, especially when it comes to uncovering patterns across thousands of joints and multiple pipeline systems. At the same time, complex data sets offer the opportunity to extract powerful insights for pipeline integrity management. Dimensionality reduction techniques, such as Uniform Manifold Approximation and Projection (UMAP), help simplify this complexity and assist integrity managers in better understanding risks and attributing specific factors to the pipeline networks they manage.

“As the industry continues to digitize and scale its data infrastructure, techniques like UMAP will play a critical role in transforming data into strategic insights, helping operators manage risk, optimize maintenance, and ensure long-term pipeline integrity.”
Edmund Bennett, Principal Data Scientist, ROSEN Group

Simplifying complexity with UMAP

UMAP is a powerful machine learning technique used for dimensionality reduction – a process that simplifies complex datasets while preserving their underlying structure. In the context of pipeline analytics, UMAP enables engineers and data scientists to visualize relationships between thousands of pipeline joints, detect clusters of similar features, and uncover hidden patterns that may indicate risk. To visualize the effectiveness of UMAP, ROSEN applied this technique to a subset of carefully pre-processed data from 4,000 inspections within the Integrity Data Warehouse, which includes pipeline and joint attributes such as construction year, joint length, pipe grade, diameter, wall thickness, external coatings, and external metal loss anomaly data.
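As a rough illustration of the pre-processing such an analysis needs, the sketch below turns mixed joint attributes into numeric feature vectors: numeric fields are standardized and categorical fields are one-hot encoded. The field names, sample values, and choice of encoding are assumptions made for illustration, not a description of ROSEN's actual pipeline. A matrix of this kind is what a UMAP implementation (for example, the open-source umap-learn package) would then project to two dimensions.

```python
from statistics import mean, pstdev

# Hypothetical joint records; field names and values are illustrative only.
joints = [
    {"construction_year": 1968, "joint_length_m": 11.8, "wall_thickness_mm": 7.1, "coating": "coal_tar"},
    {"construction_year": 1972, "joint_length_m": 12.1, "wall_thickness_mm": 7.9, "coating": "coal_tar"},
    {"construction_year": 2004, "joint_length_m": 12.0, "wall_thickness_mm": 9.5, "coating": "fbe"},
    {"construction_year": 2010, "joint_length_m": 18.2, "wall_thickness_mm": 12.7, "coating": "3lpe"},
]

NUMERIC = ["construction_year", "joint_length_m", "wall_thickness_mm"]
CATEGORICAL = ["coating"]


def build_feature_matrix(records):
    """Standardize numeric fields and one-hot encode categorical fields."""
    columns = []  # one list of values per output feature
    names = []
    for field in NUMERIC:
        values = [float(r[field]) for r in records]
        mu, sigma = mean(values), pstdev(values)
        sigma = sigma or 1.0  # avoid division by zero for constant columns
        columns.append([(v - mu) / sigma for v in values])
        names.append(field)
    for field in CATEGORICAL:
        for level in sorted({r[field] for r in records}):
            columns.append([1.0 if r[field] == level else 0.0 for r in records])
            names.append(f"{field}={level}")
    # Transpose the columns into one feature vector per joint.
    rows = [list(row) for row in zip(*columns)]
    return names, rows


feature_names, X = build_feature_matrix(joints)
```

Standardizing keeps attributes with large numeric ranges, such as construction year, from dominating the distances that a dimensionality reduction method works with.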

The resulting visualization reveals clear patterns: similar joints cluster together, while dissimilar ones form distinct regions. UMAP often groups joints from the same pipeline even though it is given no asset identifiers, which points to latent structural similarities. The larger-scale structure can be analyzed to identify groups of pipelines whose risk is potentially greater. The upper left corner of the plot contains joints that typically exhibit more severe corrosion, marking a region of higher risk within the UMAP plot. Pipelines driving network-level risk for an operator can then be identified easily. Crucially, incorporating maintenance and other operational cost data into the UMAP analysis lets the operator identify assets and specific joints whose operational costs are incommensurate with their integrity state, and focus on the outliers driving risk across an asset portfolio.
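One simple way to use such an embedding for similarity matching is a nearest-neighbour lookup: given a joint, find the joints closest to it in the two-dimensional plot. The sketch below assumes the embedding has already been computed. The coordinates, joint identifiers, and use of Euclidean distance are illustrative assumptions rather than ROSEN's method.

```python
import math

# Hypothetical 2-D embedding coordinates keyed by joint identifier.
embedding = {
    "P1-J001": (0.10, 0.20),
    "P1-J002": (0.15, 0.25),
    "P1-J003": (0.40, 0.30),
    "P2-J001": (3.00, 3.10),
    "P2-J002": (3.20, 2.90),
}


def nearest_joints(query_id, coords, k=2):
    """Return the k joints closest to query_id by Euclidean distance."""
    qx, qy = coords[query_id]
    distances = [
        (math.hypot(x - qx, y - qy), joint_id)
        for joint_id, (x, y) in coords.items()
        if joint_id != query_id
    ]
    return [joint_id for _, joint_id in sorted(distances)[:k]]


neighbours = nearest_joints("P1-J001", embedding, k=2)
```

Because UMAP is generally better at preserving local neighbourhoods than large-scale distances, a lookup like this is most meaningful for small values of `k`.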

Figure 1: Two-dimensional UMAP of a subset of the dataset for a specific operator. Each point represents a joint, with color indicating individual pipeline assets. The background contours and heatmap represent the average probability that joints in that region have external corrosion anomalies exceeding 20% of wall thickness. The tri-point indicates the pipe joints with the highest probability of external corrosion, which drives relevant risk mitigation such as close-interval potential surveys (CIPS) or in-line inspection. Joints of the same pipeline are often grouped together, indicating their similarity. Meanwhile, the larger-scale structure can be analyzed to determine groups of pipelines whose risk is potentially greater.
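A background heatmap of this kind can be approximated by binning the embedding into grid cells and averaging a per-joint exceedance probability within each cell. The sketch below is one such approximation. It assumes every joint already carries a probability that an external corrosion anomaly exceeds 20% of wall thickness. The cell size, coordinates, and probabilities are invented for illustration, and the actual figure may have been produced differently.

```python
import math
from collections import defaultdict

# Hypothetical records: (x, y) embedding coordinates and a per-joint
# probability of exceeding 20% wall-thickness external metal loss.
points = [
    (0.2, 3.8, 0.60),
    (0.4, 3.6, 0.80),
    (0.7, 3.1, 0.70),
    (2.5, 1.2, 0.10),
    (2.8, 1.9, 0.20),
    (3.6, 0.4, 0.05),
]


def mean_exceedance_by_cell(records, cell_size=1.0):
    """Average the exceedance probability of the joints in each grid cell."""
    buckets = defaultdict(list)
    for x, y, probability in records:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        buckets[cell].append(probability)
    return {cell: sum(vals) / len(vals) for cell, vals in buckets.items()}


grid = mean_exceedance_by_cell(points)
# Cell with the highest average probability, i.e. the highest-risk region.
hottest_cell = max(grid, key=grid.get)
```

In this toy data the highest-risk cell lies at small x and large y, mirroring the upper left region of the plot.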

Using UMAP for similarity matching and clustering allows ROSEN to uncover actionable insights for asset management through direct and quick comparisons between joints and assets. It also highlights the need for richer, contextual data to fully characterize pipelines and understand risk at the joint level. As the industry continues to digitize and scale its data infrastructure, techniques like UMAP will play a critical role in turning complex data into actionable intelligence. A modern approach to asset management, complemented by the techniques described here, will allow the industry to address threats proactively and in an informed, coordinated manner by targeting inspection and maintenance resources at the areas that carry the highest risk. These techniques will also enable operators to improve the efficiency of workflows, reduce data silos, and simplify manual reviews. To achieve these objectives, operators should consider the following changes:

  • Adopt centralized systems for pipeline integrity data to enable both easy comparisons between assets and network-level summaries.
  • Adopt a holistic approach, bringing pipeline integrity and pipeline commercial viability together.
  • Develop network-level summaries of piggable and unpiggable asset risk, and consider these crucial safety metrics at the management level.