Predictive Maintenance

Calculating the value of avoided unplanned downtime

Win investment in reliability by estimating the value of an avoided unplanned event.

By Burt Hurlock, Azima

It’s the Catch-22 of successful predictive analytics: When unexpected events such as unplanned downtime have been all but eliminated, what’s the value of prediction? How much continuing investment does an unpredictable future justify? The hard lesson of so many catastrophic events—industrial accidents and terrorist attacks alike—is that they were avoidable. With 20/20 hindsight, the data was available to predict the events, and there was time to intervene and head off disaster. How much have the avoidable catastrophic events of the 21st century cost us, and how much would we have willingly spent to prevent them?

Events that don’t happen have value, and sophisticated industrial operators know how much. But the ripple effect of unanticipated events such as mechanical failure can carry so widely and subtly through the cost structure of production operations that most companies don’t estimate its cost. Not so for one company I’ve worked with, which wishes to remain anonymous; for our purposes we’ll call it the ABC Company. The ABC Company doesn’t merely estimate costs for unplanned events; it applies forensic accounting to measure dozens of variables that drive up costs as these events unfold. Understanding these variables and costs has been vital to the ABC Company’s strategy for financially outperforming its competitors. The ABC Company knows exactly what an event that didn’t happen is worth and invests in prediction, among other operating strategies, to prevent these incidents.

Most mechanical failure cost-avoidance estimates rely on notional guidelines about major repair and capital equipment replacement costs. They may even estimate opportunity costs such as lost revenue. Few estimating models go further, but they should, because reality is more complicated. What about the unproductive labor hours and energy costs that accumulate during unplanned shutdowns? What about incident investigation, insurance claim processing, and remediation costs? What about wasted raw materials, potential finished product spoilage, and contractual costs, which may include penalties for failing to meet contractual obligations or the cost of procuring substitutes, such as buying power off the grid during peak energy demand to deliver on the commitments of a baseload power plant that goes down?

The actual costs of unplanned events don’t just vary by industry; they also vary by incident. As companies take a wider view of interrupted production, they see the domino effect of unplanned events, the diversion of resources, and the waste of time, effort and capital. Properly estimating the cost of unreliability requires a cost-accounting approach to tracking and compiling the permutations of wasted inputs and the opportunity and/or replacement cost of deliverables. It may be widely accepted that “proving a negative is impossible,” but in the case of knowing the worth of heading off an unplanned event such as a mechanical failure, that just isn’t the reality. Top-quartile financial performers know exactly what events that don’t happen are worth. They even build incentive systems to reward the elimination of such events.
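The cost-accounting approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the ABC Company's method: the class `UnplannedEvent`, the function `avoided_value`, the cost category names, and every dollar figure are hypothetical, chosen only to show how itemized costs per incident roll up into a rough value for one avoided event.

```python
from dataclasses import dataclass, field


@dataclass
class UnplannedEvent:
    """One unplanned downtime incident with itemized costs (all figures hypothetical)."""
    name: str
    costs: dict[str, float] = field(default_factory=dict)

    def total(self) -> float:
        # Sum every tracked cost category for this incident.
        return sum(self.costs.values())


def avoided_value(events: list[UnplannedEvent]) -> float:
    """Average total cost per incident, used here as a rough proxy
    for the value of one avoided event (an assumption of this sketch)."""
    if not events:
        return 0.0
    return sum(e.total() for e in events) / len(events)


# Hypothetical incident records; categories mirror those named in the article.
events = [
    UnplannedEvent("pump failure", {
        "repair": 40_000.0,
        "lost_revenue": 120_000.0,
        "idle_labor": 15_000.0,
        "energy": 5_000.0,
        "investigation": 8_000.0,
        "spoiled_material": 12_000.0,
    }),
    UnplannedEvent("gearbox failure", {
        "repair": 90_000.0,
        "lost_revenue": 200_000.0,
        "idle_labor": 30_000.0,
        "replacement_supply": 80_000.0,
    }),
]

for event in events:
    print(f"{event.name}: {event.total():,.0f}")
print(f"average cost per incident: {avoided_value(events):,.0f}")
```

Because costs vary by incident, each record carries its own set of categories rather than a fixed schema; a simple average is only one way to summarize them.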

Four factors appear to drive how well industrial enterprises understand or seek to understand the cost of unplanned downtime. These are 1) competition (for market share), 2) capacity utilization, 3) criticality of product and delivery, and 4) culture. Any one or all of the four “Cs” may determine how urgently enterprises focus on reliability. They dictate both resilience to and tolerance for unplanned events.

For instance, companies with excess production capacity operating in mildly competitive environments have less incentive to invest in reliability because unplanned downtime presents few risks. Production can shift to idle capacity, and plentiful or indifferent customers may tolerate delays, assuming they experience any interruption at all. The risk to companies producing time-sensitive products such as energy to the grid or continuous-flow feedstock to process industries is considerably higher. The compound effects of capacity and time constraints and the logistical complexities and cost of alternative sourcing can make reliability both a key differentiator and a competitive advantage for companies that get risk management right. Finally, where market forces don’t necessarily place a premium on reliability, regulation may simulate the risks and costs of market forces.

All avoided unplanned downtime has value. How well it’s understood and managed depends on the fourth C, culture, which determines the strategic and tactical decisions companies make to chart the path to success under circumstances unique to their markets. Some companies overinvest in standby capacity and backup resources to achieve the industrial equivalent of sleeping at night. Other companies scrimp, lurching from crisis to crisis in perpetual reaction to events beyond their control. And then there are the Goldilocks companies.

Goldilocks companies raise risk and cost management to a fine art. They maintain monitoring and diagnostic information systems that have been superbly tuned over years of performance and failure observations. They understand failure modes, degradation curves, and the tradeoffs among the cost of repair, the cost of replacement, and the cost of downtime. They understand their options and proactively manage maintenance activities and capital expenditures.
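The tradeoff among repair, replacement, and downtime costs can be illustrated with a simple expected-cost comparison. This is a sketch under stated assumptions, not a method taken from any company described here: the function `expected_cost`, the failure probabilities, and all cost figures are hypothetical, and real programs would draw them from monitoring data and degradation curves.

```python
def expected_cost(action_cost: float, failure_probability: float,
                  downtime_cost: float) -> float:
    """Cost of acting now plus the probability-weighted cost of an unplanned failure."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    return action_cost + failure_probability * downtime_cost


# Hypothetical cost of one unplanned event on this asset.
DOWNTIME_COST = 300_000.0

# Each option: (cost of the action, assumed chance of failure over the planning period).
options = {
    "run as-is": expected_cost(0.0, 0.5, DOWNTIME_COST),
    "repair": expected_cost(40_000.0, 0.25, DOWNTIME_COST),
    "replace": expected_cost(150_000.0, 0.0, DOWNTIME_COST),
}

best = min(options, key=options.get)
for name, cost in options.items():
    print(f"{name}: {cost:,.0f}")
print(f"lowest expected cost: {best}")
```

With these made-up inputs, repair comes out ahead of both doing nothing and buying new equipment, which is the kind of result that explains why a well-informed operator may keep assets that are neither new nor in perfect working order.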

Surprisingly few production assets at Goldilocks companies are new or in perfect working order. They could be, but that would cost a lot and wouldn’t necessarily achieve materially better outcomes. Goldilocks companies know how to optimize: how to balance investment and risk to achieve results that are just right.

The single most important ingredient in the just-right balance between investment and risk is information. Companies that know the value of events that don’t happen also understand that reliability is nine-tenths information and only one-tenth perspiration. The opposite is true for companies that don’t know the value of events that don’t happen.

Ten years ago, the ABC Company faced the perfect four-C storm, starting with a highly competitive industry in which timely product delivery is critical and production capacity is sized to match demand. The ABC Company also faced pockets of cultural resistance to its reliability implementation strategy. The company’s breakthrough insight was that consistent and accurate information would be the foundation for its reliability goals and that its maintenance and capital expenditure planning would never be better than the quality of the data informing it. Defining the strategy and tactics for a world-class reliability program began with understanding the value of events that didn’t happen. Ten years later, the incidence of unexpected events in the ABC Company’s production operations has fallen by 95%. Even so, the company continues to apply rigorous after-action forensic accounting procedures to improve its understanding of how much these increasingly rare unplanned events cost.