How does one demand reliable equipment? Most industrial managers don't know the answer to that question, and too often, they're part of the problem.
Management policies of the past shape the future, for good or bad, and many times management develops policies that tell purchasing to buy at the lowest cost. Engineering is told to meet a deadline no matter what it takes. Vendors are ordered to deliver parts and equipment on time, at the lowest cost, with no requirement for a specific level of reliability. Maintenance is urged to "Hurry! Let's get the production line up and running."
What these great policies have in common is that their goal is to save money and time. But as they begin to take effect, companies find that either asset reliability is becoming less controllable or they're starting to spend large amounts of money to preserve it.
What's needed is optimal reliability at optimal cost at all times. A company must spend only what is required for a specific amount of reliability, and no more. I call it "reliability on demand."
Because most companies don't understand this process, major reliability problems begin when new equipment is introduced or a rebuild is completed. The immediate effect is lost money and often, ultimately, lost jobs.
What you don't know is killing you
What most people don't know about asset reliability is that most failure modes fall into the infant mortality category (Figure 1). Infant mortality is the term used when new or overhauled equipment breaks down shortly after startup. The term originated in the health care world, where a medical dictionary defines it as the death of a child under the age of one.
In the world of asset reliability, we define infant mortality as the failure of an asset during startup, within a short period of time after new equipment is installed, or soon after the equipment has been overhauled to its previous condition. The time between startup and failure could be minutes, hours, days or months. Reliability studies typically show that about 68% of known failure modes are a result of infant mortality.
So, most failures are likely to occur soon after new or overhauled equipment is started up. As most equipment is operated, it becomes less likely to fail. At some point, the probability of failure levels out to a plateau known as random failure, which allows failures to be detected through a proactive maintenance strategy. Infant mortality is difficult to identify and detect before failure occurs.
On first encounter with Figure 1, most people don't believe it. The first time I saw it, I thought, "No way most failures come as a result of infant mortality." I was wrong. Let's think about this for a minute. I recommend you perform a basic experiment to see if infant mortality is prevalent in your operation:
Step 1: Measure the mean time between failures (MTBF) for a production line or process area during some time interval (t) while it's running in a steady state. MTBF = t divided by the number of emergency work orders. For example, at steady state, 160 hours divided by 10 emergency work orders gives an MTBF of 16 hours.
Step 2: After the next shutdown, when equipment has been overhauled or new equipment has been added, measure MTBF again for the same length of time: 160 hours divided by 20 emergency work orders gives an MTBF of eight hours.
Step 3: Compare the data.
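If you keep your work order counts in a spreadsheet or a CMMS export, the three steps take only a few lines of Python. The sketch below is a minimal illustration using the example numbers above; the function name mtbf and the variable names are my own assumptions, not part of any standard tool, and the percentage drop is just one possible way to do the comparison in Step 3.

```python
def mtbf(interval_hours: float, emergency_work_orders: int) -> float:
    """MTBF = t divided by the number of emergency work orders."""
    if emergency_work_orders <= 0:
        # With no emergency work orders in the interval, MTBF is undefined here.
        raise ValueError("need at least one emergency work order")
    return interval_hours / emergency_work_orders


# Step 1: steady-state interval (example figures from the text).
steady_state_mtbf = mtbf(160, 10)

# Step 2: same interval length after a shutdown or overhaul.
post_startup_mtbf = mtbf(160, 20)

# Step 3: compare. A lower MTBF after startup suggests infant mortality.
# (Expressing the change as a fractional drop is an illustrative choice.)
fractional_drop = (steady_state_mtbf - post_startup_mtbf) / steady_state_mtbf

print(f"Steady-state MTBF: {steady_state_mtbf:.1f} hours")
print(f"Post-startup MTBF: {post_startup_mtbf:.1f} hours")
print(f"Drop after startup: {fractional_drop:.0%}")
```

With the example figures, MTBF falls from 16 hours to eight hours after the shutdown, a drop of half. Substitute your own interval and work order counts to see where your operation stands.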
The first step in resolving any problem is to acknowledge you have a problem. Once this has been achieved, a company is ready to begin the journey to reliability on demand.
Why equipment dies
Causes of infant mortality can be broken down into a few categories (Table 1). To truly reduce the likelihood of infant mortality and improve reliability, those issues (and more) must be addressed and prioritized by risk.
1. New/overhauled equipment
2. Maintenance issues
3. Production issues
The U.S. Department of Defense (DoD) is very much aware of infant mortality. It has completed research to identify how to make equipment more reliable so that once it is commissioned or overhauled, it will have a high probability of operating failure-free for a specific time. Extensive DoD standards and procedures have been developed to optimize reliability. The DoD reliability standards, process and more can be found at www.enre.umd.edu/publications/rs&h.htm.