
Alarm management: More time to make the right decision

In this installment of Automation Zone, explore the very strong link between better alarm management and improved reliability.

Tyron Vardy is a global product director for alarm and operations management software at Honeywell Process Solutions. Tyron spoke with Plant Services in June at the Honeywell User Group event on how the operator’s role is evolving with alarm management technologies.

PS: When it comes to improved alarm management, what’s the first place to start?

TV: Historically, what alarm management tools have done is they’ve provided lots of clever ways of reporting on how the alarm system is performing. You’ve got a thousand alarms a day, and they give you a stack of very nice-looking reports, and everyone is trying to work out different ways to report on the same data. For me, the next focus or the next wave of evolution of alarm management is to start turning all those reports into something that people can actually take action on that will drive safer operations.

When you talk to most people about alarm management and process safety, they’ll always cite standards like ISA 18.2 as the guideline, or EEMUA 191, which says that they want to try and get down to something like one or two alarms every 10 minutes, which is deemed to be safe, sort of steady-state operations, which is absolutely fine.
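To make the "one or two alarms every 10 minutes" benchmark concrete, here is a minimal sketch of how an alarm log could be checked against it. The function names, the sample timestamps, and the choice of fixed windows starting at the first alarm are assumptions for illustration only; they are not taken from any standard or vendor tool.

```python
from collections import Counter
from datetime import datetime, timedelta


def alarms_per_window(timestamps, window=timedelta(minutes=10)):
    """Count alarms in consecutive fixed-length windows.

    Windows start at the earliest alarm; empty windows count as 0.
    """
    if not timestamps:
        return []
    ordered = sorted(timestamps)
    start = ordered[0]
    # timedelta // timedelta yields the integer index of the window.
    counts = Counter((ts - start) // window for ts in ordered)
    n_windows = (ordered[-1] - start) // window + 1
    return [counts.get(i, 0) for i in range(n_windows)]


def share_of_windows_over(counts, target=2):
    """Fraction of windows whose alarm count exceeds the target."""
    if not counts:
        return 0.0
    return sum(1 for c in counts if c > target) / len(counts)


# Hypothetical alarm times for one operator console.
t0 = datetime(2024, 1, 1, 8, 0)
sample = [t0 + timedelta(minutes=m) for m in (0, 1, 2, 15, 31)]

counts = alarms_per_window(sample)
print(counts)                          # [3, 1, 0, 1]
print(sum(counts) / len(counts))       # 1.25 alarms per 10 minutes on average
print(share_of_windows_over(counts))   # 0.25
```

In this made-up sample the average sits inside the benchmark, but one window in four still exceeds it, which is why an average alone can hide short bursts.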

I think we need to take a step back and say, “Why is that acceptable? Why is one or two things going wrong every 10 minutes acceptable?” Truly the fix to alarm management is to address the cause and not the effects, so to fix alarm management you need to take away the alarms. That to me is what the vision of alarm management should be, to operate the plant without any alarms or abnormal situations.

If we had this conversation one or two years ago, it would be a vision that we wouldn’t know how to deliver. But now, with advancements in cloud technology, IIoT, and digital transformation, there is a path to that vision because now you’re not just looking at the alarms – you can plug in data from any other source; you can see what the operator was doing in the field at the time. You can see what the asset management system was doing. You can see what safety overrides and bypasses were inhibited. You know, all those layers of protection that tend to get eroded away when different things happen.

I think the way that anyone should start is that, if they’re looking at alarm management on day one, they’ve got 60% of alarms that don’t mean a thing. The chances are that the operator can just close his or her eyes, and blindly acknowledge them. They don’t necessarily mean anything because these alarms have been around for the last 25, 30, 40 years, everyone knows what they are, it’s just noise and people ignore it.

So, for example, the first thing to do is to get 1,000 alarms down to 300, and that’s what the tools are great for today. Plants can get a lot of improvement in a very short period of time. Three months is not uncommon to reduce alarms by 50%. The problem then lies in what you do with the rest of the alarms, because the rest truly do need action.
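One common way to find the noise is to rank alarms by how often they fire and see how few tags account for most of the load. The sketch below assumes the alarm log is simply a list of tag names; the function name, the tag names, and the 50% share are hypothetical and chosen only to illustrate the idea.

```python
from collections import Counter


def top_contributors(alarm_tags, share=0.5):
    """Return the most frequent tags that together reach `share` of all alarms."""
    counts = Counter(alarm_tags)
    total = sum(counts.values())
    selected = []
    running = 0
    for tag, n in counts.most_common():
        if running >= share * total:
            break
        selected.append((tag, n))
        running += n
    return selected


# Hypothetical alarm log: one entry per alarm occurrence.
log = ["FI-101.PVLO"] * 6 + ["LI-205.PVHI"] * 3 + ["TI-310.PVHI"]

print(top_contributors(log))             # [('FI-101.PVLO', 6)]
print(len(top_contributors(log, 0.9)))   # 2
```

Here a single tag produces more than half of the alarms, so reviewing that one tag first would give the largest reduction for the least effort.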

You’ve got to change the mindset of the operator as well. If the operator is used to 1,000 alarms and he can blindly acknowledge 60% of them, when you’ve (now) got 300 alarms and one of those comes in, he can’t blindly acknowledge (that) anymore because that alarm genuinely means something that it didn’t mean before you implemented this whole management program.

I think alarm management is now beyond just what happens at the console. If I can get a plant and an asset and an operation to operate their equipment or their process where they’re not exceeding the limits of the equipment or the limits of the plant, then they’re not putting assets under strain and exposing the plant to risk. And when you’re talking oil and gas, for example, or chemical plants, uptime is everything. So I think you’re seeing a lot of the larger companies looking at alarm management and operational management as being intertwined. If they can manage the process better, they don’t have the alarms.

PS: Some plants are training some of their operators on basic maintenance tasks so that they’ll have more of an investment in their machine. Yet how can operators succeed when they’re bombarded with thousands of alarms?

TV: I think that’s a good point. What is the job of operations? It’s to run the plant. You know, the business wants to make money, but it has to be made as safely as possible, so no one is going to compromise process safety or people safety for the sake of more production. So you’ve got to try and get the most out of the business while maintaining operational integrity.

When we get to this vision of managing the plant, managing the alarms correctly, operators can almost move to process managers or even business managers; they can do more than what their current scope is. I think people get complacent and think, “Well, operators are there to react to the alarms.” I don’t think they are. They’re there to mitigate risk, and when things go wrong it’s their job to get things back on track.

I tend to find that if you talk to maintenance teams about alarm management, there’s a nodding approval that, yes, it’s important but no one is taking ownership of it. The production reliability guy, it’s his job at the end of the day to make sure that the business is performing and teams are meeting throughput goals, so they are the guys who will tend to drive alarm management. The maintenance team seems to be on the reactive side – the work order comes in, and they go and fix it – whereas the job of the reliability guy and the production guy is to run that plant as safely as possible and produce what operations needs, so they certainly understand it.

PS: How much of a challenge is it to get an executive team to buy into something like this?

TV: If you go back 15 years, selling alarm management felt like selling an insurance policy: “If you buy it, something might not happen.” And the answer that will come back is, well, it might not happen anyway, so why am I buying it?

That’s completely different now. When I see companies that do alarm management well, it’s always because at the very top, somebody is saying, “This is one of the priorities that we need to do.” They might phrase it in a different way. They might say, “Look, you know, we need to drive this business harder and get the maximum out of the business, but we need to make sure that we maintain the highest level of safety,” and you can’t do that with a thousand alarms per operator per hour.

There’s also a huge range between the companies that are looking at predictive analytics and cloud computing to try and solve reliability or safety issues, and those that are five years behind the curve. For those that are just starting out, they can get to where others are much more quickly than before because it’s a well-trodden path.

It’s exciting times at the moment because I think this is the first time in almost 20 years that we’re at that point where we’re going to see a complete phase change, a complete shift change in what alarm management can deliver. I think now, we’re in a state where we can start to look at true early event detection, where we can say, “We know something is going wrong, why are we waiting for the alarm?”

Truly, we need to give operators more time to make the right decision, so if we can give them 5, 10, 15 more minutes of thinking time than they have today, you create a much safer, less reactive environment. What problem are we trying to solve? Reduce the process risk, increase the operational integrity, and reduce the operator error.
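As a rough illustration of where extra thinking time could come from, the sketch below fits a straight line to recent readings and estimates how many minutes remain before the value would reach its high alarm limit. This is a deliberately naive linear extrapolation, not a description of any vendor's early event detection; the function name, the readings, and the limit are all assumptions.

```python
def minutes_to_limit(samples, high_limit):
    """Estimate minutes until the value reaches high_limit.

    samples: list of (minute, value) pairs, oldest first.
    Returns 0.0 if already at or above the limit, None if the trend
    is flat or falling, or if there is too little data to fit a slope.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None
    # Least-squares slope in value units per minute.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom
    _, last_v = samples[-1]
    if last_v >= high_limit:
        return 0.0
    if slope <= 0:
        return None
    return (high_limit - last_v) / slope


# Hypothetical temperature readings rising one degree per minute.
readings = [(0, 80.0), (1, 81.0), (2, 82.0), (3, 83.0)]
print(minutes_to_limit(readings, high_limit=95.0))   # 12.0
```

With these invented numbers, an operator warned now would have roughly 12 minutes before the alarm itself fired. Real processes are rarely this linear, so a production approach would need something more robust than a straight-line fit.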

For me, the vision is let’s set the bar high. We’ll probably never get zero alarms because things are always going to go wrong; it’s a combustible industry and things are always going to break. But, why is one alarm every 10 minutes acceptable, where it should be one alarm every 10 hours or every 12 hours or every shift or something? I think we have to look at it differently.