Reliability CSI: piling the corpses

One of the foundations of reliability engineering is the promise that problems will be solved once, and the data will be used to prevent recurrence of the same problem. Sadly, the gulf that exists in most organizations between reliability engineering and operations people tends to prevent delivery on this promise.

From a practical standpoint, this situation can be both good news and bad news. It's bad news because of the unnecessary production losses and the maintenance expense wasted on repeatedly "solving" the same problems. It's good news because, given the right instructions, a smart engineering student can, in the course of a summer, usually identify enough highly leveraged maintenance and reliability work to pay for himself or herself many times over. Of course, it doesn't have to be a summer intern. Technical people who are temporarily available for medical or other reasons can also perform the simple steps below.

Piling the corpses must start with a program sponsor who secures funding or the full-time availability of an employee for eight to ten weeks. A locked room, sometimes called "the morgue," should also be available for the duration of the study. Anyone who is interested in the program but can't deliver the manpower and space needed will have to secure the support of a sponsor who can.

Using a ten-week template, piling the corpses might go something like this at a medium-sized, single-site operation. The last day of each week should be spent cataloging the data gathered during the week and reviewing it for an hour with the sponsor:

Week 1, learn the campus. Beginning with a list of interview appointments with managers, arranged by the program sponsor, have the intern spend twenty minutes with each leader at or above the foreman level. The discussion should be scripted and designed to identify the five or so pieces of equipment that cause more than their share of maintenance headaches. If there are processes that cause similar problems, they should be noted as well. This week will help the intern learn the layout of the operation and get a sense of which leaders are going to be forthcoming and interested in solving their key problems. The leaders should also be encouraged to suggest interviews with a few other people who might shed additional light on problems. If the same name comes up from more than one manager, the discussions have probably identified a local problem solver. This is noteworthy.
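The Friday cataloging for Week 1 can be as simple as a tally. The sketch below is one possible way to do it, not a prescribed tool: it assumes the interview notes have been typed into a dictionary keyed by leader, and the field names `problem_assets` and `referrals` are hypothetical.

```python
from collections import defaultdict

def tally_interviews(interviews):
    """Rank assets by how many leaders named them and flag repeat referrals.

    `interviews` maps a leader's name to a dict with two hypothetical keys:
    "problem_assets" (list of asset numbers) and "referrals" (list of names).
    """
    asset_votes = defaultdict(set)
    referral_votes = defaultdict(set)
    for leader, notes in interviews.items():
        for asset in notes.get("problem_assets", []):
            asset_votes[asset].add(leader)
        for person in notes.get("referrals", []):
            referral_votes[person].add(leader)

    # Assets named by the most leaders come first; ties sort by asset number.
    ranked_assets = sorted(asset_votes, key=lambda a: (-len(asset_votes[a]), a))
    # A name suggested by more than one leader is a likely local problem solver.
    problem_solvers = sorted(p for p, who in referral_votes.items() if len(who) > 1)
    return ranked_assets, problem_solvers
```

Counting distinct leaders rather than raw mentions keeps one talkative interviewee from dominating the list.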

Weeks 2 and 3, gather all available root cause analysis (RCA) or other formal problem investigation records and catalog them. No time limit applies to these data. Gather them all. There is usually a very limited number of leaders who conduct this kind of analysis. Find out who they are and meet with them. Identify every problem that has had a formal analysis and obtain a copy of the front page or summary sheet for each. If a summary sheet doesn't show the department, asset number of the problem equipment, equipment owner (at the shop floor level), problem description, and cause determined in the investigation, add this information on an index card stapled to the cover page for the problem.
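If the catalog is kept electronically as well as on index cards, the five items above map naturally onto a small record. The following is a minimal sketch under that assumption; the class and field names are illustrative, and the helper simply lists what still has to be written on the card.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class RcaSummary:
    """One formal investigation, reduced to the five catalog items."""
    department: Optional[str] = None
    asset_number: Optional[str] = None
    equipment_owner: Optional[str] = None  # shop-floor-level owner
    problem_description: Optional[str] = None
    cause: Optional[str] = None

def missing_fields(record):
    """Return the names of catalog items that are blank on this record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

For example, `missing_fields(RcaSummary(department="Packaging", asset_number="P-101"))` lists the three items the intern still needs to track down for that investigation.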

Weeks 4 and 5, pull together a list, or if necessary a stack, of the last year's maintenance work orders sorted by asset number. Review the list with the equipment owners in each factory area. If repair time or cost data are available, use these, along with frequency of occurrence information, to identify the equipment with big problems. Of course this should square with the problem equipment lists from Week 1, but mismatches do occur, and the work order data will add new information to the stories where it does match. If appropriate, differentiate among shutdown work, planned maintenance, and corrective work performed during production time. Identify as many high-maintenance assets as possible during the two-week period. Usually a small percentage of equipment will consume 50 percent of maintenance effort. This is the picture the intern is looking for. Associate the work with the equipment requiring it. Don't forget the Friday data compilation.
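Where the work orders can be exported from a maintenance system, the "small percentage of equipment, 50 percent of effort" picture is a simple Pareto cut. The sketch below assumes each work order has been reduced to a dictionary with hypothetical keys `asset` and `hours`; cost could be substituted for hours without changing the logic.

```python
from collections import defaultdict

def pareto_assets(work_orders, share=0.5):
    """Return the top assets whose combined hours reach `share` of the total.

    `work_orders` is a list of dicts with hypothetical keys "asset" and "hours".
    The result is a list of (asset, total_hours) pairs, largest first.
    """
    hours = defaultdict(float)
    for wo in work_orders:
        hours[wo["asset"]] += wo["hours"]

    total = sum(hours.values())
    # Largest consumers first; ties sort by asset number for a stable list.
    ranked = sorted(hours.items(), key=lambda kv: (-kv[1], kv[0]))

    top, running = [], 0.0
    for asset, asset_hours in ranked:
        if total == 0 or running >= share * total:
            break
        top.append((asset, asset_hours))
        running += asset_hours
    return top
```

Running the same function separately on shutdown, planned, and corrective work orders gives the differentiated view described above.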

Week 5, identify the repair parts purchased during the last year and list them in order of quantity. If data are not available for MRO purchases, begin with a list of the major MRO part and supply vendors and identify the most frequent and costly purchases from them. Starting with the most frequently used items, list them and identify the equipment on which they are used. Generally nuts, bolts, washers, and other common hardware should not be a part of this analysis. Bearings, exotic hardware, OEM parts, and other repair items are of primary interest. They are usually expensive and tend to create large amounts of maintenance work. The parts need not be very expensive. Often the big ticket item is the work of installing them. Filters, catalysts, production supplies, and maintenance supplies can often point to problems. Often these purchases will have been made on blanket orders or from overhead accounts. This does not diminish their importance.
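The parts list can be built the same way as the work order list. This is a rough sketch that assumes purchase records are available as dictionaries; the keys `part`, `category`, `quantity`, and `asset`, and the category labels used to screen out common hardware, are all hypothetical and would need to match whatever the purchasing data actually contain.

```python
from collections import defaultdict

# Hypothetical category labels for hardware left out of the analysis.
COMMON_HARDWARE = {"nut", "bolt", "washer"}

def rank_parts(purchases):
    """List repair parts by quantity purchased, with the assets they are used on.

    Each purchase is a dict with hypothetical keys "part", "category",
    "quantity", and "asset". Returns (part, total_quantity, assets) tuples.
    """
    quantity = defaultdict(int)
    used_on = defaultdict(set)
    for p in purchases:
        if p["category"] in COMMON_HARDWARE:
            continue  # common hardware is not part of this analysis
        quantity[p["part"]] += p["quantity"]
        used_on[p["part"]].add(p["asset"])

    ordered = sorted(quantity, key=lambda part: (-quantity[part], part))
    return [(part, quantity[part], sorted(used_on[part])) for part in ordered]
```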

Weeks 6 through 10, pile the corpses. Starting with the problem equipment identified in Week 1 and adding other offenders as they come to light, establish folders with the summer's information on each piece of problem equipment. If an interested financial person can be available to help, it should be possible to identify approximate production loss and maintenance costs for the worst offenders. If not, creation of the business case for ending repetitive failures can begin after school starts. The project sponsor should have learned enough about the cost of repetitive maintenance that he or she won't let it go unaddressed.
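For the worst offenders, a first-pass cost figure needs only a few inputs. The function below is a back-of-the-envelope sketch, not an accounting method: it assumes the financial person can supply a production loss rate per downtime hour and a loaded labor rate, and every parameter name is illustrative.

```python
def annual_cost(downtime_hours, loss_per_hour, labor_hours, labor_rate, parts_cost):
    """Approximate one asset's yearly cost from its folder contents.

    All inputs are estimates for the same twelve-month period; the rates
    are assumed to come from the plant's own financial data.
    """
    production_loss = downtime_hours * loss_per_hour
    maintenance = labor_hours * labor_rate + parts_cost
    return {
        "production_loss": production_loss,
        "maintenance": maintenance,
        "total": production_loss + maintenance,
    }
```

Keeping production loss and maintenance cost separate makes it easier to see which one drives the business case for a given asset.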

In most plants the folders for the worst equipment will contain enough information to support the kind of corrective action discussed in this year's Strategic Maintenance columns titled "Data Driven Maintenance." Part I begins here: www.plantservices.com/articles/2012/08-Strategic-Maintenance-data-driven-maintenance.html. It may be appropriate to extend the data gathering on these assets to more than a year. It is very likely that they are the kind of perennial problem assets that drain productivity and money from plants everywhere.

Piling the corpses and determining the lessons they contain for plant operations is the best way to identify the equipment issues that are constraining productivity and profit for any operation.

Sophisticated data systems will sometimes simplify the process, but the need to build a list of failures and assign their true costs remains. Once the big piles of corpses are identified, the real work of data-driven maintenance begins.