PdM after CMMS

Puget Sound Energy takes it to the next level.

By Ed Espinosa, CMRP, Puget Sound Energy

PdM and CMMS

Read Ed Espinosa’s previous article about PdM and CMMS implementation at Puget Sound Energy at http://www.plantservices.com/articles/2013/08-why-pdm-programs-fail/.

It’s been a number of years since you selected and implemented your CMMS. You spent an enormous amount of time creating templates for various CMMS modules and then months populating them with decades’ worth of what now seems to be insignificant legacy data. Now it’s time to scrub them for errors. Your scrub template finds the expected errors that you overlooked, and these must be repaired before proceeding. You spend even more time than anticipated correcting them before submitting the templates to your company’s EAM support group for back-end CMMS loading into a sandbox environment.

You and your team have worked very hard to achieve this milestone in your company’s MRO development. So now you have a functioning CMMS. You attend a tradeshow to benchmark how your company is doing compared to others and to see where you can improve your process by increasing efficiency, reliability, and availability while reducing costs. You listen to a presentation detailing the tenets of reliability-centered maintenance (RCM). You take what you’ve learned to your leadership, and the decision is made to perform an RCM implementation. A living RCM program would most certainly be beneficial in achieving the goals identified earlier during your benchmarking exercise.

Much like the decision to implement a CMMS, the decision to implement a living RCM program is equally enormous and will be resource-intensive for many months, and possibly years, to come. Integrating the results of this study into the CMMS is again labor-intensive and demands heartfelt dedication for the benefits to be realized. Initial results are mixed, and it may be too early to judge whether the RCM project delivered the results promised.

Figure 1. Your MRO team needs to standardize how they execute maintenance across your facilities.

In an attempt to better ascertain the results of an expensive RCM implementation, management decides your MRO team needs to standardize how it executes maintenance across your facilities (Figure 1). This need translates into another project from corporate, and, like all projects, this one demands its share of manpower to implement fully. Your management team decides that some sort of standardized maintenance management system or work management process is long overdue since it’s difficult to measure results with scattered, unconnected data. And, if results can’t be measured, then they can’t be managed, and, if they can’t be managed, then by the same reasoning, long-awaited improvements to your process can’t be realized. And if your process doesn’t improve, all the work and money that has been invested into your organization yields questionable returns, at best. That is not a wise use of your company’s resources. So, what needs to happen next?

Next, the consultants arrive on-site and yet another project begins: a long, tedious, and painful journey that takes time away from your real job, the work of running an effective MRO team or, at least, what you’d like your team to be considered. You have endless meetings to attend. Some are to discuss project milestones, others budgets and timelines, and yet others lessons learned. All of this takes time, time spent finding out you are behind schedule in rolling out this work improvement project, and of course you and your team find yourselves behind in the real work that matters, the work that keeps your operation functioning and keeps the lights on.

When the consultants leave your company a year from now, the new work process is clunky at first but, with time, your crew gets the hang of it. Your KPIs are published for the first time, and they clearly show there’s plenty of room for improvement. So another journey begins to improve your process. The benefit of KPIs is that they point to where to improve, which is done incrementally. And, again, like most anything worth striving for, process improvement will take time.

Your organization has accomplished the following implementations: CMMS, RCM, and work management process (WMP). You have been operating with the new programs for a number of years. You and your senior management are happy with the results but feel there is more to do to take your organization up to the next level. So where do you start? What else is there to improve upon? Your senior management wants to push the limits of performance by pushing the MRO envelope to eliminate all defects and thus all failures.

The next level in the maintenance continuum for many facilities in this situation is achieving a state of mind where failures are not merely anticipated, with contingency measures kept in hot standby, but eliminated by eliminating their causes. The next logical question is: What are the causes of failure? To answer that, we need to look at the failures themselves for hints.

So where does one find failures? One finds them documented within the CMMS, assuming you and your team are following the work management philosophy of identifying and documenting work. This is not work already embedded in your CMMS as preventive maintenance, which is designed to generate orders at a given frequency, but corrective work stemming from observed materiel deficiencies.

Corrective work, or what some folks like to refer to as reactive work since, as a maintainer of equipment, you are reacting to a situation, is identified by three means. The first is plant readings. As readings are being performed, the reading taker is trained to look, listen, and feel. The maintainer is trained to use the senses to detect the minutest changes in perceived conditions, such as the sound rotating machinery makes, the temperature of a motor as detected by the back of the hand, or the sound of slop in belts rotating on a pulley. This method may seem primitive, but it certainly captures the pulse of the process.

Another way to identify corrective work is through inspection. The work management system recently implemented should have a means set aside for doing this. Sections of the plant are divided into periodic inspection routes designed for the sole purpose of looking for and finding materiel discrepancies. These discrepancies are then documented in the CMMS and, if agreed to by the MRO staff, acted upon to restore equipment to new or like-new status.

And finally, the third means of identifying corrective work is the execution of preventive maintenance. During planned inspections, either invasive or indirect, material discrepancies are uncovered that were not initially apparent to a process variable reading taker or during a conventional external system walk-down inspection. An example of PM inspection uncovering corrective work is predictive maintenance. In many cases, PdM can find incipient problems, otherwise known as potential failures, long before functional or catastrophic failures occur.

Root cause failure analysis (RCFA) is an ongoing process, part of any MRO maintenance cycle. RCFA is the main element that drives continuous improvement, however small the strides may be. RCFA done properly will detect hidden failures. There are generally three levels of fact-finding that lead to the real cause after failure and the ensuing damage have occurred. Level 1 identifies the damaging event and its potential causes. At Level 2, one looks for a human cause for the damage, such as a procedure not followed. Level 3, if pursued, might indicate that the procedure, even if followed, had material flaws inadvertently written into it. Engaging in periodic RCFA activities as part of your MRO routine is being proactive.

So, how is corrective maintenance documented within the CMMS corralled to extract meaningful information as a basis for pointed action to reduce failure and increase reliability per RCFA? The functionality to capitalize on this lies within your CMMS. Modern CMMS software has the functionality to analyze failure. This process uses performance measurements that depend partly on operator input and partly on standard system timestamps that start upon initial CMMS documentation.

A work request notification is entered into the CMMS. To successfully save this record, the CMMS has required fields that need to be populated such as equipment number, functional location, and possibly work priority. The CMMS has several more fields that may not be required for record saving but need to be populated nonetheless for proper failure analysis. Failure analysis is data-driven. More data is good. More data that is accurate is optimum. Your CMMS may have anywhere between a few and a dozen related failure analysis fields to populate. Closely review these fields and understand the information they provide and decide if you need these populated or not.
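The distinction between fields required to save a record and fields needed for failure analysis can be sketched in a few lines of Python. This is only an illustration: the field names, the `Notification` class, and the `review` helper are hypothetical, and a real CMMS defines its own required fields and screens.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names; a real CMMS defines its own.
REQUIRED_TO_SAVE = ("equipment_number", "functional_location", "priority")
NEEDED_FOR_ANALYSIS = ("object_part", "damage_code", "cause_code")


@dataclass
class Notification:
    """A simplified work request notification."""
    equipment_number: Optional[str] = None
    functional_location: Optional[str] = None
    priority: Optional[str] = None
    object_part: Optional[str] = None
    damage_code: Optional[str] = None
    cause_code: Optional[str] = None


def missing(record: Notification, names: tuple) -> list:
    """Return the listed field names that are empty on the record."""
    return [name for name in names if not getattr(record, name)]


def review(record: Notification) -> dict:
    """Separate gaps that block saving from gaps that only weaken analysis."""
    return {
        "blocks_save": missing(record, REQUIRED_TO_SAVE),
        "weakens_analysis": missing(record, NEEDED_FOR_ANALYSIS),
    }


note = Notification(
    equipment_number="P-101",
    functional_location="UNIT1-CW",
    priority="2",
    damage_code="LEAK",
)
print(review(note))
```

In this sketch the record would save, yet two analysis fields are still empty. That is the kind of gap worth catching at entry time rather than years later.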

Failure analysis is tied to equipment records representing operating assets in your facility. A key failure analysis field related to the asset is object type. The object type feeds the object part field in the work request notification. When rolling out the CMMS implementation, the functional location/equipment template used to back-end-build the hierarchy will have a field that, when populated, satisfies the object part data requirement. The object part consists of a series of characters identifying what type of equipment it is. The character string should have information identifying pump, or valve, or fan, or gear, for example.

The ways that equipment fails are represented by codes that are preloaded within the applicable CMMS module during the implementation and listed in a dynamic value list for the end user to select when assigning a type of damage to an equipment failure. Not all equipment is subjected to the same damage or fails the same way. There may be overlap but not always, so it is best that each object part have a subset of damage codes assigned. When populating the damage code portion of the upload template, make sure you use a wide-ranging, robust set of damages. Such a list can be found in most MRO textbooks.
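One way to picture a damage code subset per object part is a simple lookup table. The sketch below is in Python and uses made-up object parts and codes; the names `DAMAGE_CODES_BY_OBJECT_PART`, `allowed_damage_codes`, and `is_valid_damage` are assumptions for illustration, not features of any particular CMMS.

```python
# Hypothetical catalog: each object part gets its own subset of damage codes.
# Some codes overlap between parts (LEAK, VIBRATION); others do not.
DAMAGE_CODES_BY_OBJECT_PART = {
    "PUMP": {"LEAK", "VIBRATION", "OVERHEAT", "CAVITATION"},
    "VALVE": {"LEAK", "STUCK", "EROSION"},
    "FAN": {"VIBRATION", "IMBALANCE", "OVERHEAT"},
}


def allowed_damage_codes(object_part: str) -> list:
    """Return the value list a user would pick from for this object part."""
    return sorted(DAMAGE_CODES_BY_OBJECT_PART.get(object_part.upper(), set()))


def is_valid_damage(object_part: str, damage_code: str) -> bool:
    """Check that a damage code belongs to the object part's subset."""
    codes = DAMAGE_CODES_BY_OBJECT_PART.get(object_part.upper(), set())
    return damage_code.upper() in codes


print(allowed_damage_codes("valve"))
print(is_valid_damage("pump", "cavitation"))
print(is_valid_damage("valve", "cavitation"))
```

Restricting each value list this way keeps users from recording damage that cannot apply to the equipment, which in turn keeps the failure data clean.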

Just like damage codes, cause codes are equally important. Populate this list of cause codes onto the upload template and please give it quite a bit of thought, spurred on by cross-functional team discussion. The cause codes can be unique to the damage codes with some spillover to other damage types. Again, a textbook on the MRO subject matter can provide an adequate list of causes to different damages.

Once the corrective order is complete and “actuals” are being recorded, make sure to enter what the solution was to correct the damage and restore the equipment. This is the remedy to the problem. Recording this bit of information may come in handy in future failures because the track record would exist on what was done in the past to correct the problem. It may be discovered that the older remedy to the problem may have introduced a defect into the process that ultimately led to another failure, which means that another remedy needs to be developed and implemented. The only way to know for certain is to take the extra time and record the data in the CMMS. This information may hold the key to which actions must be taken to reduce downtime from several days to only several hours.

Lots of accurate data accumulated over the years is optimum. Now combine this with analytical functionality such as mean time between failures (MTBF) and mean time to repair (MTTR), calculated from the malfunction start and malfunction end fields commonly found in a CMMS, and you’ve got a toolset that provides a path for equipment reliability improvement. You improve reliability by minimizing or eliminating the causes of failures altogether. If you eliminate the cause of failure, you essentially eliminate failure. And that is measurable, quantifiable improvement. Again, the elements that make this analysis possible with such rewarding results are the object type (what category of equipment), damage code (what is wrong with it that prevents it from operating), cause code (what is the culprit that triggered the damage), and remedy code (what was done to repair the damage and restore the equipment).
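As a rough illustration of how malfunction start and end timestamps feed these two measures, here is a minimal Python sketch. It assumes MTTR is the average time from malfunction start to malfunction end, and MTBF is the average running time from the end of one malfunction to the start of the next. Your CMMS may define or calculate these differently, and the timestamps below are invented.

```python
from datetime import datetime


def mttr_hours(malfunctions):
    """Average hours from malfunction start to malfunction end."""
    durations = [(end - start).total_seconds() / 3600
                 for start, end in malfunctions]
    return sum(durations) / len(durations)


def mtbf_hours(malfunctions):
    """Average hours between one malfunction's end and the next one's start."""
    ordered = sorted(malfunctions)
    gaps = [(next_start - prev_end).total_seconds() / 3600
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two malfunctions to compute MTBF")
    return sum(gaps) / len(gaps)


# Invented (malfunction start, malfunction end) pairs for one pump.
history = [
    (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 12, 0)),
    (datetime(2024, 1, 11, 12, 0), datetime(2024, 1, 11, 14, 0)),
    (datetime(2024, 1, 31, 14, 0), datetime(2024, 1, 31, 20, 0)),
]

print("MTTR (h):", mttr_hours(history))  # 4.0
print("MTBF (h):", mtbf_hours(history))  # 360.0
```

Both numbers are only as good as the timestamps behind them, which is why accurate malfunction start and end entries matter.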

Ed Espinosa is program manager, CMMS, at Puget Sound Energy in Bellingham, Washington. Contact him at edward.espinosa@pse.com.

The other piece of the RCFA functionality toolset that would help immensely in cutting to the chase is the Pareto chart. The Pareto principle states that 80% of your problems are caused by 20% of your equipment in an industrial setting. A Pareto chart displays the data in such a way as to make it obvious what the biggest problem is.

An example of how to apply a Pareto chart to your MRO application is finding which equipment has the highest occurrence of failure, which has the second-highest, and so on. The chart ranks equipment failure occurrence from highest to lowest, from left to right. You now know what the most troublesome equipment is.
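The ranking behind a Pareto chart is simple to compute. The Python sketch below counts failures per piece of equipment, sorts from highest to lowest, and adds a running cumulative percentage. The equipment tags and counts are invented, and `pareto_table` is a hypothetical helper, not a CMMS function.

```python
from collections import Counter


def pareto_table(failed_equipment):
    """Rank equipment by failure count with a cumulative percentage."""
    counts = Counter(failed_equipment)
    total = sum(counts.values())
    # Highest count first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows, running = [], 0
    for equipment, count in ranked:
        running += count
        rows.append((equipment, count, round(100 * running / total, 1)))
    return rows


# Invented data: one entry per corrective order, keyed by equipment tag.
orders = ["P-101"] * 6 + ["FAN-07"] * 3 + ["V-220"] * 1

for equipment, count, cumulative in pareto_table(orders):
    print(f"{equipment:8} {'#' * count:8} {count:3} {cumulative:6.1f}%")
```

With this made-up data, the first row alone accounts for 60% of failures, which is exactly the signal a Pareto chart is meant to make obvious.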

With this information, you can now concentrate scarce resources on the most troublesome equipment to garner the most maintenance bang for the buck. Further analysis can tell you what damage said equipment sustains and, taking it a bit further, what the causes of that damage are. It focuses your effort to obtain the highest return with minimal expenditure. That is a quick win in anybody’s book.
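The drill-down from troublesome equipment to its damage and causes can be sketched the same way. In the Python example below, each corrective order is an invented (equipment, damage code, cause code) record, and `worst_equipment` and `damage_cause_breakdown` are hypothetical helpers. It shows one possible approach under those assumptions.

```python
from collections import Counter

# Invented corrective orders: (equipment, damage code, cause code).
orders = [
    ("P-101", "LEAK", "SEAL WEAR"),
    ("P-101", "LEAK", "SEAL WEAR"),
    ("P-101", "VIBRATION", "MISALIGNMENT"),
    ("FAN-07", "IMBALANCE", "FOULING"),
    ("P-101", "LEAK", "IMPROPER INSTALL"),
]


def worst_equipment(records):
    """Return the equipment tag with the most corrective orders."""
    counts = Counter(equipment for equipment, _, _ in records)
    return min(counts, key=lambda tag: (-counts[tag], tag))


def damage_cause_breakdown(records, equipment):
    """Count (damage, cause) pairs for one piece of equipment, highest first."""
    pairs = Counter((damage, cause)
                    for tag, damage, cause in records if tag == equipment)
    return sorted(pairs.items(), key=lambda item: (-item[1], item[0]))


target = worst_equipment(orders)
print(target)
for (damage, cause), count in damage_cause_breakdown(orders, target):
    print(f"  {damage} / {cause}: {count}")
```

In this invented history, the leading damage and cause pair points to a single remedy to investigate first.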

Remember tediously populating those cumbersome data templates and thinking they had no value for your future use of the CMMS? Think again. The data on these templates may very well hold the key to future process improvements through an even and steady improvement of equipment reliability.