Failure Reporting - Getting Down to Business!

I find that many plants get too detailed with their failure reporting. The failure coding schemes I see are often mind-boggling. Some have hundreds of different failure codes, which leaves operators and maintainers frustrated over which code to choose and who should do the choosing. Moreover, the failure reporting system often reports only equipment-related problems, ignoring other factors that interfere with the success of the business, and it often commingles failure modes (what went wrong), failure mechanisms (how it went wrong) and failure causes (why it went wrong). Let’s discuss how to set our Failure Reporting process straight.

I think where we go astray is that we try to use our failure reporting process to automate our cause analysis process. These are two very important parts of a Failure Reporting, Analysis and Corrective Action System (FRACAS), but they needn’t be performed simultaneously in all cases. Failure Reporting, the first part of FRACAS, should be conducted for all events. But in reality, we need only perform Cause Analysis on an exception basis: Apparent Cause Analysis (ACA) for small or easy-to-solve problems and Root Cause Analysis (RCA) for larger, more complex problems. When we try to turn failure reporting into an automated, combined failure reporting and cause analysis process, we run into trouble: hundreds of failure codes, too much complexity and no adherence to the failure reporting process.

I like to keep failure reporting at a high level. How is it affecting the business? Here’s the game plan for business level failure reporting. This process is illustrated in the attached figure:

1. Identify the Affected Function. This is at the manufacturing process level – dry material feed, wet material feed, mixing, baking, packaging, shipping, etc. Usually, we have conveyance and transfer processes between these functions. Report any time something happens, even if it didn’t affect the output of the overall production line. If one function acts badly but doesn’t affect our overall results, we may have “fat” in the system, such as excess WIP inventory or excessive redundancy. Improving the inherent reliability of the process might enable us to lean out that function.

2. Report Your Business Level Failure Mode. At a business level, the Failure Mode is simply a loss of Availability, Yield (speed) and/or Quality. Yes, you got it – those are the elements of Overall Equipment Effectiveness (OEE), or Overall Business Effectiveness (OBE) as I prefer to call it – more on the distinction later.

a. Availability Losses. Availability is simply the number of hours we’re running divided by 8760. Yes, that’s the total number of hours in a year. Account for every hour so we know where our production time went and why. We’re not interested in maximizing reliability, availability or OEE/OBE for their own sake; we want to optimize them so that we maximize Return on Net Assets (RONA).

b. Yield (Speed) Losses. This is simply the production rate per hour versus our best sustained standard rate. Report our performance compared to perfection, irrespective of market demand (more on this below).

c. Quality Losses. This is simply the percentage of our total production that is scrapped, requires rework or must be sold at a lower grade level. Distinguish between the different types of defects where possible.

3. Report Your Business Level Failure Effects. Normally, this is in dollars or production units. At the functional level, we may not be able to report in dollars if the stoppages, slowdowns or defects aren’t affecting our total output. But with production units, we can do some what-if analysis and get to estimated dollars. Also report any Health, Safety and Environmental impacts or near misses, as well as risk-based effects (e.g. insurance claims).

4. Define Your Business Level Failure Causes. This is going to seem too simple, but at the initial level, you need only identify why the loss occurred in general terms. These include:

a. Marketing. If we’re in an undersold position, the losses (availability, yield or some combination) are marketing induced. Or, if our manufacturing system has been asked to perform beyond its capabilities, our losses are marketing induced.

b. Supply Chain. If our losses are attributable to insufficient raw material quality or quantity, they are supply chain induced.

c. Production. If our losses are attributable to production functions (equipment changeover, improper set-up or adjustment, incorrect control or operation), the losses are production induced.

d. Equipment. If the losses occur because the equipment has failed to perform as designed, we attribute the losses to equipment.

You can see why I prefer the term OBE to OEE. Many, if not most, of our losses are actually attributable to something other than the equipment. By categorizing our failures, losses and near misses appropriately, we get a look at how we’re doing across the entire business.
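
To make the arithmetic behind the three loss types concrete, here is a minimal Python sketch. It assumes the conventional OEE calculation, in which the overall figure is the product of availability, yield (speed) and quality. The class name, field names and sample numbers are hypothetical and chosen only for illustration; they are not taken from any particular reporting system.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760  # calendar hours in a year, as used in the text


@dataclass
class PeriodTotals:
    """Hypothetical yearly totals for one function or line."""
    run_hours: float           # hours actually running
    units_produced: float      # all units made, good or not
    best_rate_per_hour: float  # best sustained standard rate
    good_units: float          # units not scrapped, reworked or downgraded


def business_effectiveness(t: PeriodTotals) -> dict:
    """Return availability, yield, quality and their product (assumed OEE/OBE)."""
    if t.run_hours <= 0 or t.units_produced <= 0 or t.best_rate_per_hour <= 0:
        raise ValueError("run hours, units produced and best rate must be positive")
    availability = t.run_hours / HOURS_PER_YEAR
    actual_rate = t.units_produced / t.run_hours
    yield_speed = actual_rate / t.best_rate_per_hour
    quality = t.good_units / t.units_produced
    return {
        "availability": availability,
        "yield": yield_speed,
        "quality": quality,
        "obe": availability * yield_speed * quality,
    }


if __name__ == "__main__":
    # Made-up example: 6,570 running hours at 90 units/hour against a best
    # sustained rate of 100 units/hour, with 95% of output at full grade.
    totals = PeriodTotals(
        run_hours=6570,
        units_produced=591300,
        best_rate_per_hour=100,
        good_units=561735,
    )
    for name, value in business_effectiveness(totals).items():
        print(f"{name}: {value:.3f}")
```

With these made-up numbers the three factors come out to about 0.75, 0.90 and 0.95, for an overall figure of roughly 0.64. Note how three individually respectable numbers multiply into a much less flattering one, which is why every hour and every unit needs to be accounted for.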

The Failure Reporting process is intended to help us see where we need to probe further. Simplifying it and raising it to the level of the whole business automates the generation of your OEE/OBE metric and points out areas where we need to delve further with Analysis and Corrective Actions, rounding out the FRACAS approach. The fact is simple: there’s no FRACAS without Failure Reporting, and there’s no Failure Reporting if it’s more than the organization can, or is willing to, handle.
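
As a rough sketch of how small such a reporting scheme can be, the Python below models one report with just the four fields described above (function, failure mode, failure cause and effect in production units) and ranks the losses to show where further analysis might be warranted. The record layout, the code lists and the sample data are assumptions made for illustration, not a prescribed format.

```python
from collections import Counter
from dataclasses import dataclass

# The three business level failure modes and four cause categories from the text.
MODES = {"availability", "yield", "quality"}
CAUSES = {"marketing", "supply_chain", "production", "equipment"}


@dataclass(frozen=True)
class FailureReport:
    """Hypothetical minimal report: seven codes in total instead of hundreds."""
    function: str      # affected function, e.g. "mixing" or "packaging"
    mode: str          # one of MODES
    cause: str         # one of CAUSES
    lost_units: float  # failure effect, in production units

    def __post_init__(self):
        if self.mode not in MODES:
            raise ValueError(f"unknown failure mode: {self.mode}")
        if self.cause not in CAUSES:
            raise ValueError(f"unknown failure cause: {self.cause}")


def losses_by(reports, field):
    """Total lost units grouped by a report field, largest loss first."""
    totals = Counter()
    for report in reports:
        totals[getattr(report, field)] += report.lost_units
    return totals.most_common()


if __name__ == "__main__":
    # Made-up sample reports.
    reports = [
        FailureReport("baking", "availability", "equipment", 120.0),
        FailureReport("packaging", "yield", "production", 300.0),
        FailureReport("mixing", "quality", "supply_chain", 80.0),
        FailureReport("packaging", "availability", "production", 200.0),
    ]
    print(losses_by(reports, "cause"))
    print(losses_by(reports, "function"))
```

Ranking by cause and then by function gives a simple Pareto view: the largest totals are the candidates for an ACA or RCA, while everything else is simply recorded.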

I look forward to your comments and inquiries!

Email me with questions at drew.troyer@sigma-reliability.com