Reliability Centered Maintenance (RCM) in Oil & Gas: What Conventional Maintenance Keeps Missing
If you’re a maintenance engineer in oil & gas, you’ve probably followed OEM recommendations, planned overhauls, and kept the CMMS humming. You’ve met availability targets — most of the time. So why does the reliability engineer keep pushing for “RCM studies” that seem to add more analysis than action?
The answer lies in a gap that both roles often miss: conventional maintenance treats symptoms; Reliability Centered Maintenance (RCM) treats the system’s risk profile. This article is written from both chairs — what a reliability engineer looks for, and what a maintenance engineer can learn to close the gap.
The Blind Spot: Failure Consequences vs. Failure Rates
A typical maintenance engineer’s mindset: “Which equipment fails most often? Let’s increase PM frequency there.” A reliability engineer’s first question: “What is the consequence of that failure?”
In oil & gas, a high‑failure‑rate pump might be cheap and redundant — the consequence is low. Meanwhile, a pressure safety valve (PSV) that never fails (or appears to never fail) can have catastrophic consequences if it fails on demand. This is the core oversight: hidden failures in protective devices are rarely tracked by traditional maintenance metrics.
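The difference between ranking by failure rate and ranking by consequence can be shown with a few lines of Python. This is a minimal sketch, not a criticality method from any standard: the asset names, failure frequencies, and cost figures are invented for illustration, and risk is simplified to expected annual loss (frequency times consequence).

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    failures_per_year: float  # hypothetical frequency of the failure event
    consequence_cost: float   # hypothetical cost per event, USD

    @property
    def annual_risk(self) -> float:
        # Simplified risk: expected annual loss = frequency x consequence
        return self.failures_per_year * self.consequence_cost


# Illustrative numbers only, not field data
assets = [
    Asset("Redundant transfer pump", failures_per_year=4.0, consequence_cost=5_000),
    Asset("PSV on separator", failures_per_year=0.02, consequence_cost=50_000_000),
]

# The maintenance view: which asset fails most often?
by_rate = sorted(assets, key=lambda a: a.failures_per_year, reverse=True)
# The reliability view: which asset carries the most risk?
by_risk = sorted(assets, key=lambda a: a.annual_risk, reverse=True)

print("Ranked by failure rate:", [a.name for a in by_rate])
print("Ranked by risk:        ", [a.name for a in by_risk])
```

With these assumed figures the pump tops the failure-rate list, while the PSV tops the risk list by a wide margin.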
Why Hidden Failures Are the Silent Profit Killer
Take an Emergency Shutdown Valve (ESDV) on a wellhead platform. It sits idle, never operated during normal production. The maintenance plan says “stroke test every 6 months”. But what if that test is bypassed to save time, or only partially completed? A seized valve then goes undetected. It has suffered a hidden failure: it appears healthy on the work order but will not close when a fire breaks out.
RCM forces you to ask:
- What is the function of this device? (Prevent loss of containment)
- What failure modes matter? (Fail to close on demand)
- How do we detect it before it’s needed? (Partial stroke testing, online monitoring)
Conventional maintenance often relies on visual inspection or a simple “operate to verify” check. RCM pushes for scheduled failure‑finding tests, backed by condition‑based and predictive tasks, that uncover hidden failures before they become a major incident.
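To see why a skipped stroke test matters, consider the widely used simplified approximation for a single protective device: the average probability of failure on demand is roughly the dangerous undetected failure rate times the test interval, divided by two. The sketch below applies it; the failure rate is a made-up value for illustration, and the approximation only holds when the rate times the interval is much smaller than 1.

```python
def pfd_avg(dangerous_undetected_rate_per_year: float,
            test_interval_years: float) -> float:
    """Simplified average probability of failure on demand (single device).

    Uses PFDavg ~= lambda_DU * TI / 2, valid only when lambda_DU * TI << 1.
    """
    return dangerous_undetected_rate_per_year * test_interval_years / 2


# Hypothetical rate: not a vendor figure and not field data
LAMBDA_DU = 0.02  # dangerous undetected failures per year

scenarios = [
    ("stroke test every 6 months", 0.5),
    ("one test skipped (12 months)", 1.0),
]

for label, interval in scenarios:
    print(f"{label}: PFDavg ~ {pfd_avg(LAMBDA_DU, interval):.4f}")
```

Under this approximation, skipping a single test doubles the chance that the valve is already failed when the demand arrives, even though nothing changes on the work order history.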
What Maintenance Engineers Overlook (From a Reliability Perspective)
- OEM recommendations are not tailored: OEM schedules assume average operating conditions. Your sour gas, high‑H2S environment or cyclic offshore duty changes failure patterns. RCM uses your actual failure data.
- “We’ve always done it this way” masks waste: Up to half of scheduled PM tasks add no value. RCM identifies which tasks are truly necessary to prevent failure or reduce consequences.
- Reactive metrics hide risk: Focusing on MTBF (Mean Time Between Failures) for non‑critical equipment while ignoring the criticality matrix leads to misallocated resources.
- Protective device testing is often seen as a compliance checkbox, not a risk control.
RCM in a Nutshell: The Seven Questions
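RCM is not a software package; it is a disciplined framework. For each asset, you answer seven questions:
1. What are the asset’s functions and performance standards in its operating context?
2. In what ways can it fail to fulfil those functions (functional failures)?
3. What causes each functional failure (failure modes)?
4. What happens when each failure occurs (failure effects)?
5. In what way does each failure matter (failure consequences)?
6. What can be done to predict or prevent each failure (proactive tasks)?
7. What should be done if no suitable proactive task can be found (default actions)?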
When maintenance engineers skip questions 4 and 5, they default to “fix it when it breaks” or “over‑maintain”. RCM aligns maintenance with business risk.
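One way to make skipped questions visible is to record each analysis as a structured entry and check it for gaps. The sketch below is an assumption about how such a record might look, not a feature of any RCM tool: the class name and field names are hypothetical paraphrases of the seven questions, and the ESDV answers are illustrative.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RcmRecord:
    # Field order mirrors the seven questions (names are paraphrases)
    function: Optional[str] = None
    functional_failure: Optional[str] = None
    failure_mode: Optional[str] = None
    failure_effect: Optional[str] = None
    failure_consequence: Optional[str] = None
    proactive_task: Optional[str] = None
    default_action: Optional[str] = None

    def unanswered(self) -> List[int]:
        """Return the 1-based numbers of questions with no answer yet."""
        return [
            number
            for number, field in enumerate(fields(self), start=1)
            if not getattr(self, field.name)
        ]


# Illustrative entry: effects and consequences were never analysed
esdv = RcmRecord(
    function="Isolate the well on demand to prevent loss of containment",
    functional_failure="Does not isolate the well",
    failure_mode="Fails to close on demand",
    proactive_task="Stroke test every 6 months",
    default_action="Redesign or add redundancy if testing is not feasible",
)

print("Unanswered questions:", esdv.unanswered())
```

A record like this makes the gap explicit: a task exists, but nobody has written down why the failure matters.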
A Real‑World Gap: Compressor Dry‑Gas Seal Failure
Consider a centrifugal compressor with dry‑gas seals. Conventional maintenance: replace seals every 3 years per OEM. But one facility saw three seal failures in 18 months — each causing 5 days of lost production.
The reliability engineer dug deeper: the failures were not due to wear, but to lube oil contamination from a poorly designed buffer system, a failure mode never listed in the OEM manual. The RCM analysis changed the strategy:
- Added online seal gas consumption monitoring (condition‑based).
- Changed buffer system filtration.
- Eliminated the rigid 3‑year replacement, which exposed each newly fitted seal to infant‑mortality failures.
Result: no seal failures in the next 3 years, and $2.1M saved in avoided production losses and material costs.
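The condition‑based element of a strategy like this can start very simply. The sketch below flags seal gas consumption readings that rise well above an early baseline. It is an illustration only: the function name, the readings, the baseline window, and the three‑sigma limit are all assumptions, and a real installation would use the alarm logic and limits agreed with the seal and compressor vendors.

```python
from statistics import mean, stdev


def leakage_alerts(readings, baseline_size=5, sigma_limit=3.0):
    """Return indices of readings above baseline mean + sigma_limit * stdev.

    The first `baseline_size` readings define the healthy baseline and are
    never flagged themselves.
    """
    baseline = readings[:baseline_size]
    limit = mean(baseline) + sigma_limit * stdev(baseline)
    return [
        index
        for index, value in enumerate(readings)
        if index >= baseline_size and value > limit
    ]


# Hypothetical daily seal gas consumption readings (arbitrary units)
flow = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.3, 11.5, 12.4]

print("Readings to investigate:", leakage_alerts(flow))
```

The point of the design is that the trigger is the seal's own behavior rather than the calendar, so intervention happens when degradation starts instead of at a fixed 3‑year mark.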
How to Start Closing the Gap Today
You don’t need a full RCM analysis on every asset. Start with these three steps:
- Criticality ranking: Identify assets whose failure has high safety, environmental, or financial impact.
- Review protective devices: List all safety valves, trips, and alarms. Are they function‑tested to prove they will work on demand? Do you have a way to detect hidden failures?
- Audit your PM tasks: For the top 20 most critical assets, challenge each task: “Does this prevent failure, predict failure, or just give us something to do?”
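The PM audit in the last step can be prototyped against a CMMS export before any workshop takes place. The following is a rough sketch under stated assumptions: the task list, the field names, and the set of "useful" task types are hypothetical, and the rule is deliberately crude. It only flags tasks for discussion and does not decide anything.

```python
from collections import Counter

# Assumed set of task types that act on a failure mode (for this sketch only)
USEFUL_TYPES = {"preventive", "predictive", "failure_finding"}


def audit(task: dict) -> str:
    """Classify one PM task as 'keep' or 'challenge' (illustrative rule)."""
    if not task.get("failure_mode"):
        return "challenge"  # no failure mode named: what is the task for?
    if task.get("type") not in USEFUL_TYPES:
        return "challenge"  # neither prevents, predicts, nor finds a failure
    return "keep"


# Hypothetical CMMS extract
pm_tasks = [
    {"asset": "ESDV-101", "task": "Full stroke test",
     "type": "failure_finding", "failure_mode": "Fails to close on demand"},
    {"asset": "K-201", "task": "Trend seal gas consumption",
     "type": "predictive", "failure_mode": "Seal contamination"},
    {"asset": "P-301", "task": "General visual check",
     "type": "inspection", "failure_mode": None},
]

summary = Counter(audit(task) for task in pm_tasks)
print(dict(summary))
```

Tasks that land in the "challenge" bucket are candidates for the question above, not automatic deletions.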
How Meta Infa Helps Bridge the Gap
At Meta Infa, we combine reliability engineering expertise with data‑driven tools to help oil & gas operators move from reactive to proactive maintenance. Our approach includes:
- RCM facilitation workshops led by certified facilitators.
- Hidden failure detection programs for safety‑critical devices.
- CMMS data analytics to identify waste and optimize PM schedules.
- Integration with condition monitoring systems to catch failures early.
Ready to eliminate hidden failures and optimize your maintenance strategy?
Let’s discuss how RCM can improve your asset reliability without adding complexity. Reach out to our reliability team.
Contact Meta Infa →