In the second half of 2012, a catastrophic environmental event occurred at a facility. No human lives were lost; however, multiple animals of an endangered species died. The organization had spent tens of millions of dollars on engineering, equipment, and training to avoid such an accident. In addition, it supported public outreach and fully embraced environmental stewardship.
We were asked to perform a root cause analysis (RCA) and to develop recommendations to reduce the likelihood of similar failures.
We prefer to use the ProAct method of root cause analysis developed by the Reliability Center (www.reliability.com). The philosophy behind the method is that failures occur as a result of a series of events, not a single event. There are different types of root causes: physical, human, and latent. The ProAct method is preferred because it is an intuitive, straightforward approach that can be applied easily.
The first two types, physical and human, are the roots most people think of when analyzing a situation. Latent roots are the responsibility of leadership and management. Most of the time, policies, procedures, training, funding decisions, oversight, and so forth are the dominant, but less visible, root causes.
There always seems to be a tradesman or operator who is the most visible person in a chain of events. In this case, the fill-in operator was asked to do a task that was rarely performed by anyone.
Tom Moriarty, PE, CMRP, is a former Coast Guardsman who served for 24 years. He was an enlisted Machinery Technician for nine years, earned a commission through Officer Candidate School, and retired as a Lt. Commander. During his final year of service, 2003, Tom was selected as the U.S. Coast Guard’s Federal Engineer of the Year, an award sponsored by the National Society of Professional Engineers (NSPE). He is a member of the Society of Maintenance and Reliability Professionals, the past Chair of the American Society of Mechanical Engineers (ASME), Canaveral Florida Section, and a member of the ASME Plant Engineering and Maintenance (PEM) Division. He has a B.S. in Mechanical Engineering from Western New England College and an MBA from Florida Institute of Technology. He is a Professional Engineer (PE) licensed in Florida and Virginia and a Certified Maintenance and Reliability Professional, and he holds various credentials in management and reliability fields. He can be reached at firstname.lastname@example.org.
At the outset of the task, the supervisor asked the fill-in operator whether he wanted to go over the procedure for the infrequent task. The operator raised his voice and angrily stated, “I’ve been working on these systems for 16 years. I know what I’m doing.” Not wanting to tee off the operator, the supervisor backed off and said no more.
The fill-in operator went to the system, unsupervised and unsupported by other operators, and began the task. He operated the system in a way that severely stressed the lifting equipment and the asset itself. The result was catastrophic failure of the system, which dramatically increased the risk to the protected species. Weather conditions beyond anyone’s control changed the animals’ behavior, increasing their interaction with the failed structure. A number of animals of the protected species were killed as a result.
The physical roots, of course, included the damaged system. The human roots included the actions of the overconfident fill-in operator. Latent roots included not having a written procedure for the task, as well as not having a training program and periodic refresher training. But the organization also did not have a checklist or a safety or process observer for infrequent tasks. In addition, it had poorly written standard operating procedures (SOPs) that didn’t cover operations when the system was degraded or damaged. It also didn’t have a preventive maintenance task that would have avoided the conditions that required the task to be performed under duress.
At the end of the RCA, a draft report with all the facts, description of the various root causes, and recommendations was distributed to various managers and supervisors in the client organization. One supervisor returned comments that the report was factual, but it seemed to him that 90% of the fault was with the fill-in operator for not following procedures or letting his supervisor know that he needed refresher training.
The old adage that when you point a finger at someone there are three fingers pointing back at you is fully in effect here. The supervisors and managers in this organization did not provide good training, did not keep training records, did not have clearly written procedures, and did not ensure that people knew how to carry out their tasks or had the support they needed. Moreover, when the system was damaged, there was no heightened awareness of the risk to wildlife. They did not increase vigilance, nor did they adjust guidance to operators on how to operate the system to reduce risk to the operators. The operators had an SOP that was poorly written, and management did not recognize it as deficient.
It’s easy to find a scapegoat to deflect responsibility. Senior leaders, managers, and supervisors have to ask, “Do I know what I’m doing?” Are your training programs, SOPs and oversight putting your team in a position to succeed?