Tried and tested techniques for risk mitigation!

I think we all agree that risk management is one of the growing areas of importance within the asset management arena. You see it creeping into every discussion and every proposed solution, and it is becoming one of the buzzwords of our time when dealing with how to extract greater economic value from maintenance or asset management processes. So far so good!

There is now a lot of information out there on risk assessments, from criticality and vulnerability analyses, to probabilistic and stochastic analysis techniques, through to sampling and a whole range of other topics and themes. And although the area is still seen as one of the black arts of asset management, the discipline in general is slowly coming to terms with how to define and recognize risk. What is often not discussed is what to do about it!

We often go down an RCM type path and start to look at maintenance interventions, detective tasks and other such things. But what else can practically be done to mitigate risk in the modern asset management arena? In my book, The Maintenance Scorecard, I discuss risk as part of the corporate target setting as well as how to proactively measure our exposure to risk.

But we need to go a little bit further and look at some practical, simple and pragmatic applications of risk mitigation in the real world. This short article contains a few pointers to help you to mitigate the effects of risks that your organization may be carrying. I have deliberately steered away from the standard areas of maintenance tasks and training, and looked to larger areas where the potential for return is greatest. At all times this article is about the larger field of asset management, rather than just maintenance.

1. Risk of obsolescence

Capital investment is usually split into four general areas. These are grouped to represent the spending breakdown of most companies, so they may not match perfectly with what your company is doing. But you get the picture!

New equipment to meet increasing demand.
Capital maintenance spending on large-item refurbishments and replacements.
Capital spending to keep up with new technology and to beat obsolescence.
Capital spending on modifications and design changes.

Within these areas, depending on what’s going on in your industry sector and company, a large percentage is often devoted to technology and beating obsolescence. Part of the argument goes something like this: “This machinery will be obsolete within the next four years, so we need to upgrade the equipment to ensure we are able to get parts for it and to keep it running!” Sounds pretty proactive, doesn’t it? Because it is! At least people are thinking ahead about when something is likely to become harder to manage in terms of parts sourcing, repairs, and the associated skill sets to support these. However, before spending the capital on buying the “new and improved” asset, it might pay to work through the following cost-justification.

Are there parts available for it now?
How long are you going to need it to run for if you don’t replace it?
Aside from parts, are there any other reasons for this change (such as skills shortages)?
What is the cost of replacing this asset, versus the cost of buying heaps of parts for it now?
What’s the cheapest option over the life of this asset?
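
To make the last two questions concrete, here is a minimal sketch in Python that compares the two options in today's money. Every figure, the 8% discount rate, and the name `present_value` are assumptions made up for illustration; substitute your own quotes and running costs.

```python
def present_value(cash_flows, discount_rate):
    """Discount a list of yearly costs (year 0 first) back to today's money."""
    return sum(cost / (1 + discount_rate) ** year
               for year, cost in enumerate(cash_flows))


# Hypothetical figures, for illustration only.
years_needed = 6       # how long the asset must keep running
discount_rate = 0.08   # assumed cost of capital

# Option A: replace now, then pay the new asset's yearly running costs.
replace_now = [2_000_000] + [50_000] * years_needed

# Option B: buy a stock of spares now and keep the old asset running.
stockpile_parts = [300_000] + [120_000] * years_needed

options = {
    "replace now": present_value(replace_now, discount_rate),
    "stockpile parts": present_value(stockpile_parts, discount_rate),
}
cheapest = min(options, key=options.get)

for name, cost in options.items():
    print(f"{name}: {cost:,.0f}")
print("cheapest over the period:", cheapest)
```

With these invented numbers the stockpile option comes out cheaper, but the point is the comparison itself: the answer can flip as soon as the required running period, the parts price, or the old asset's upkeep changes.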

Using this strategy I have helped a number of organizations to delay capital spending, or remove it altogether; to find the parts while they are still at a reasonable price and availability; and to mitigate the risk of breakdowns and of not being able to get the right parts and equipment. Often with a forecast saving of millions of dollars!

2. Risk of human error (part 1)

As assets become more and more reliable, human error becomes one of the dominant sources of failure modes. There is a range of types of human error, and a range of methods for analyzing their probability. (Personally I use H.E.A.R.T wherever possible.) However, one of the key reasons for operator and maintenance error is structural failure. This could easily be misunderstood, so let me elaborate a little. Structural failures in this case refer to the inability of people to do certain tasks to the level of performance that is required of them, because of the way that assets are designed or configured. Being unable to get their hands into certain areas, or being unable to stand up in others: these are classic structural human errors.

However, some of these error types are even closer to home. For distributed infrastructure companies, such as rail, water, electricity and gas, there is a tendency to route all alarms to a central management area, where an operator, or several operators, must work through these alarms, decide which are the more important, and then determine the actions that could be taken for each one. More often than not this revolves around deciding whether to call somebody out during off-shift times. Two errors are common in these situations. The first is that the volume of alarms coming into the centre overwhelms the operations staff there. The second is that operators are unable to see all of the screens they need to see in order to make relevant real-time judgments about what is going on.
Where there is an overwhelming number of alarms being generated, the risk is obviously that something dire will occur and nobody will be able to react in time, or that some less important issue cannot be tackled before it becomes critical. There is a range of alternatives for mitigating this; however, the most effective that I have used is an extensive review of the alarms being generated, using reliability-style principles. In the past this has resulted in a reduction of up to 33% in the operational alarms generated, while adding some critical alarms that had been overlooked. This has enabled operational call centers to reduce the incidence of nuisance alarms and to manage with their current labor levels. If this step is taken and operations are still not able to cope with the alarms coming in, then maybe it is time to review the staffing level of the alarm-monitoring centre.

Where there are problems with operators seeing all of the relevant information at once, there are generally two alternatives. Either you can change the configuration of the software to allow more detailed views on fewer screens (not as difficult as it sounds at first, but probably pretty expensive), or you can change the configuration of the control room to allow operators to see more screens at once, more alarm indicators at once, or some similar change. Obviously the potential for risk mitigation in these areas is great. While there is the temptation to try to pull everything away from fallible humans with our short attention spans, this sort of technology is still quite expensive and out of reach commercially. So these changes may provide an easy option for mitigating the risk of important, or soon to be important, events slipping under the radar.

3. Managing inventory

It is difficult to speak about risk mitigation without discussing strategies for managing the physical asset inventory. In our business, the management of risk drives inventory management.
Not only that, but most algorithms and approaches today use only historic information to try to predict future usage (within parameters, of course), so there is a built-in error in this field. Holding inventory costs money, particularly if you are sleepwalking down the path towards 95% service levels without truly understanding whether this is required for your asset base. So how can you deal with this? Within this field there is a range of strategies for sharing the risk of holding parts. The one that I prefer to use when I am able to is vendor-held stock. This option allows you to shift some of the risk to the vendors. By managing the contracts that you have in place better, you can arrive at arrangements where the vendors hold certain levels of stock on your behalf, with a guarantee to deliver it within a specific timeframe. Holding costs are reduced to practically zero, and if the contract is managed right then the risk of not having the parts is eliminated also.

4. Managing whole-of-life costs

This particular strategy has been widely adopted by those within the mining industry and others managing fleets of mobile equipment (with limited use in the rail industry currently). Look to develop a partnering arrangement with your equipment providers. This needs to be a fair arrangement whereby you share the risks of the whole-of-life costs of the asset. Basically, the vendor is asked to provide a unit cost price for the running of the asset, scaled to represent the rising cost over the life of the asset (if this is applicable), and financial agreements are made to ensure that any variation from this is either compensated or negotiated. In the mobile equipment field, operator damage and accidental damage in particular are excluded from these sorts of arrangements.
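
As a rough illustration of how such an agreement might be tracked, the Python sketch below computes the agreed cost for a given year from an escalating unit rate and compares it with what was actually spent. The cost-per-operating-hour basis, the 5% escalation, all figures, and the function names are assumptions for the example only; a real contract will define its own units and exclusions.

```python
def agreed_rate(base_rate, escalation, year):
    """Agreed cost per operating hour in a given year (year 0 = first year)."""
    return base_rate * (1 + escalation) ** year


def yearly_variance(base_rate, escalation, year, hours,
                    actual_cost, excluded_cost=0.0):
    """Actual cost minus agreed cost for one year.

    A positive result is an overrun to be compensated or negotiated.
    excluded_cost covers items outside the agreement, such as operator damage.
    """
    agreed = agreed_rate(base_rate, escalation, year) * hours
    return (actual_cost - excluded_cost) - agreed


# Hypothetical third year of an agreement (year index 2).
variance = yearly_variance(
    base_rate=40.0,        # agreed cost per operating hour in year 0
    escalation=0.05,       # agreed yearly rise in the unit rate
    year=2,
    hours=5_000,           # operating hours actually run
    actual_cost=245_000,   # everything spent on the asset this year
    excluded_cost=15_000,  # repairs attributed to operator damage
)
print(f"variance against agreement: {variance:,.0f}")
```

Keeping the excluded costs as a separate input makes it obvious which part of an overrun is open for discussion with the vendor and which part stays with you.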
In one swoop you should be able to gain some form of long-term control over the in-service portion of your whole-of-life cost management, thus mitigating some of the risks associated with cost blowouts, while at the same time encouraging an active partnering approach from your vendors in the lifecycle management of the assets. Not a bad level of risk mitigation through looking at things a little differently.

5. Managing turnarounds

This is a simple technique that is probably well in use at your plant today. When planning a turnaround there are two significant risks that you are running. The first is that you will not capture all of the relevant tasks and will end up with a breakdown shortly after returning to work. The second is that you will load up the turnaround period so much that you could easily blow out the timeframes and costs, and reduce the valuable uptime that your company is planning on for production targets. When planning a turnaround, resist the urge to include everything just because you can get to it, and look at performing some simple form of cost-benefit analysis to see whether the additional cost and time of doing a task in the shutdown is less than the consequences, weighted by likelihood, of the item failing before the next turnaround opportunity.

Summary

I hope these five short tips have given you some ideas for how to take some practical steps to mitigate risk within your company. At all times this article looks at the whole picture of asset management, not just the maintaining of the asset once it hits the ground. This is part of the evolving view that I have detected within our discipline, away from traditional views of maintenance (grease the bearing / take the reading) and towards a more complete view of managing assets over their entire life, from conception to disposal. Interestingly, it is an article on risk that doesn’t go into the complexities of probabilistic analyses!
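
As a postscript to tip 5, the turnaround check needs nothing more complex than multiplying a likelihood by a consequence. The Python sketch below is one possible way to screen candidate tasks; the task names, costs, and probabilities are invented for illustration, and the class `CandidateTask` is a hypothetical helper rather than part of any planning tool.

```python
from dataclasses import dataclass


@dataclass
class CandidateTask:
    name: str
    shutdown_cost: float        # extra cost and time of doing it in this turnaround
    failure_probability: float  # estimated chance of failure before the next turnaround
    failure_consequence: float  # estimated cost if it fails in service

    def expected_loss(self) -> float:
        """Likelihood-weighted cost of leaving the task out."""
        return self.failure_probability * self.failure_consequence

    def worth_including(self) -> bool:
        """Include the task only if doing it costs less than the expected loss."""
        return self.shutdown_cost < self.expected_loss()


# Hypothetical candidates for the turnaround scope.
tasks = [
    CandidateTask("overhaul pump P-101", 20_000, 0.30, 250_000),
    CandidateTask("repaint pipe rack", 15_000, 0.01, 50_000),
]

scope = [task.name for task in tasks if task.worth_including()]
print("include in turnaround:", scope)
```

The estimates will always be rough, but writing them down forces the conversation about why a task is in the scope, rather than including it just because you can get to it.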