The idea of using reliability data to plan maintenance and future investments is quite appealing to many maintenance managers, but the topic seems to polarize opinions. Some people feel that with the help of software tools, historical maintenance data can lead to a good prediction mechanism for future failure behavior. Others dismiss the idea, feeling that the data available is too poor for serious prognoses.
The reliability engineering literature and many scientific papers neither address this issue nor provide concrete examples of how much data is acceptable and how wide the resulting margin of error will be. For these reasons, ABB Full Service and the ABB Corporate Research Life Cycle Science group decided to conduct a case study. The goal was to see whether a statistical analysis of data from a computerized maintenance management system (CMMS) could be used as decision support for maintenance and investments. This article summarizes our findings and conclusions.
We tested several commercial reliability engineering tools. The case study site was a paper mill with a full service contract, meaning that all service activity was outsourced. The data used for the reliability calculations were taken from the site’s CMMS.
The specific equipment used in the study had a well-documented history in the CMMS, with fewer than ten entries for each failure mode during the past three years. The data quality was generally good, with only a few minor uncertainties. The equipment was considered troublesome in that it was clearly not “as good as new” after repairs. Instead, the intervals between failures seemed to be decreasing. The question arose whether it was possible to predict the approximate time of future failures and to make a cost/benefit analysis to determine whether a replacement was cheaper than future repairs.
Given the relatively thin data set, it was clear that there would be inaccuracies. The goal of the study was to find evidence to answer the following questions:
Can trends in equipment failures attributable to repairs be identified and are these trends pronounced enough to justify investments? In other words, is there a way to estimate the effect of multiple repairs on future times to failure?
Regardless of the existence of trends, can reliability analyses based on the data found in a CMMS provide a good estimate of failure behavior? A particular focus is the amount of data: is there enough information to attempt such an estimation, or will the error margin be too high? Can the failure behavior estimation be used to schedule preventive maintenance activity and inspections? Can simulations provide a good prediction of future cost to use as a basis for maintenance budget planning?
Case study approach
To find answers to these questions, we chose the following approach. As a first step, we retrieved data from the CMMS and complemented it with other information. The main result was a list of times between failures (TBF) and times to repair (TTR). The TBFs and TTRs were used to determine the proper failure and repair time distribution functions. We relied on a commercial software tool to suggest a distribution function and to calculate its parameters.
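As an illustration of this first step, the sketch below derives TBF and TTR lists from a handful of work orders. The record layout, the timestamps, and the function name are hypothetical and not taken from the study; a real CMMS export will look different. TBF is measured here from the end of one repair to the start of the next failure, which is one common convention among several.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Hypothetical CMMS export: (failure reported, repair completed) per work order.
work_orders = [
    ("2020-01-10 08:00", "2020-01-10 14:00"),
    ("2020-01-20 14:00", "2020-01-21 14:00"),
    ("2020-01-25 14:00", "2020-01-25 20:00"),
]


def tbf_and_ttr(orders):
    """Return (TBF list, TTR list) in hours for one failure mode."""
    # Sort chronologically by the time the failure was reported.
    parsed = sorted(
        (datetime.strptime(start, FMT), datetime.strptime(end, FMT))
        for start, end in orders
    )
    # Time to repair: from failure report to repair completion.
    ttr = [(end - start).total_seconds() / 3600 for start, end in parsed]
    # Time between failures: from repair completion to the next failure report.
    tbf = [
        (parsed[i + 1][0] - parsed[i][1]).total_seconds() / 3600
        for i in range(len(parsed) - 1)
    ]
    return tbf, ttr


tbf, ttr = tbf_and_ttr(work_orders)
print("TBF (h):", tbf)
print("TTR (h):", ttr)
```

Note that n work orders yield only n - 1 complete intervals between failures, which makes an already small data set even smaller.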
For the imperfect repair model, we calculated an age reduction factor using the so-called “Kijima II” method. Instead of a perfect repair that resets the age to zero, the imperfect repair model applies a factor to the age at the time of failure and continues the aging process with this “virtual age.”
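The virtual age idea can be sketched in a few lines. The recursion below is the usual textbook form of the Kijima Type II model, in which the virtual age after the n-th repair is q times the sum of the previous virtual age and the latest operating interval. The function name, the intervals, and the value of q are illustrative assumptions, and the commercial tool may implement the model differently.

```python
def kijima_ii_virtual_ages(times_between_failures, q):
    """Virtual age after each repair under the Kijima Type II model.

    q = 0 corresponds to a perfect repair (as good as new),
    q = 1 to a minimal repair (as bad as old).
    """
    ages = []
    v = 0.0  # the equipment is assumed to start as new
    for x in times_between_failures:
        # The repair scales the whole accumulated age, not just the last interval.
        v = q * (v + x)
        ages.append(v)
    return ages


# Illustrative operating intervals in hours (not from the case study).
print(kijima_ii_virtual_ages([100.0, 80.0, 60.0], q=0.5))
```

With q between 0 and 1, each repair rejuvenates the equipment only partially, so it restarts at a nonzero virtual age and tends to fail sooner than a new unit would.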
We used a commercial simulation tool to build a simulation model of the critical system components based on the reliability curves and age reduction factors. We used the model to compare different scenarios:
- Investment: We compared the cost of an investment and its resulting effect on maintenance against a baseline scenario in which nothing was changed. We tested both perfect and imperfect repairs.
- Maintenance planning: We compared maintenance strategies, which consisted of different combinations of inspections and preventive measures. The simulation tool provided some suggestions on which intervals and measures to use.
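The investment comparison can be approximated with a small Monte Carlo simulation. The sketch below is not the commercial tool's model: it assumes a two-parameter Weibull failure distribution, Kijima II imperfect repairs, negligible repair downtime, and a fixed cost per repair. All parameter values, costs, and function names are hypothetical.

```python
import math
import random


def next_failure_interval(v, beta, eta, rng):
    """Sample the operating time to the next failure, given virtual age v."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    # Solve R(v + x) / R(v) = u for x, with R(t) = exp(-(t / eta) ** beta).
    x = eta * ((v / eta) ** beta - math.log(u)) ** (1.0 / beta) - v
    return max(0.0, x)  # guard against tiny negative rounding errors


def simulate_repair_cost(beta, eta, q, horizon, cost_per_repair,
                         v0=0.0, runs=2000, seed=1):
    """Average repair cost over the horizon, starting at virtual age v0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        clock, v, failures = 0.0, v0, 0
        while True:
            x = next_failure_interval(v, beta, eta, rng)
            clock += x
            if clock > horizon:
                break
            failures += 1
            v = q * (v + x)  # Kijima II age reduction after the repair
        total += failures * cost_per_repair
    return total / runs


# Hypothetical scenario: keep the aged unit, or invest in a new one.
params = dict(beta=2.5, eta=1000.0, q=0.5, horizon=5000.0, cost_per_repair=4000.0)
keep = simulate_repair_cost(v0=1500.0, **params)
replace = 10000.0 + simulate_repair_cost(v0=0.0, **params)  # investment + repairs
print(f"keep: {keep:.0f}  replace: {replace:.0f}")
```

Because every input here is itself an estimate from a small sample, the resulting cost difference should be read together with a sensitivity check on beta, eta, and q rather than as a single number.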
Determining the distribution functions
The commercial tool suggested both the type of distribution function and its parameters based on the data taken from the CMMS. The failure distributions the software calculated had acceptable values for the correlation coefficient and the goodness-of-fit test (chi-square or modified Kolmogorov-Smirnov), which seemed to suggest a low margin of error for the final calculation. We also used the Kijima II method to determine the age reduction factors that represent the deterioration caused by multiple repairs.
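The fitting method used by the commercial tool is not described here. One common approach, shown below as a sketch, is median rank regression for a two-parameter Weibull distribution: the failure times are sorted, each is assigned a median rank using Bernard's approximation, and a straight line is fitted on the linearized Weibull plot. The slope gives the shape parameter beta, the intercept gives the scale parameter eta, and the correlation coefficient r indicates how well the points follow the line. The sample times and the function name are illustrative assumptions.

```python
import math


def fit_weibull_mrr(times):
    """Fit a two-parameter Weibull by median rank regression (y on x).

    Returns (beta, eta, r): shape, scale, and correlation coefficient.
    """
    t = sorted(times)
    n = len(t)
    xs = [math.log(ti) for ti in t]
    # Bernard's approximation of the median rank for the i-th ordered failure.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    ys = [math.log(-math.log(1.0 - f)) for f in ranks]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx                 # slope of the fitted line
    eta = math.exp(mx - my / beta)   # from intercept = -beta * ln(eta)
    r = sxy / math.sqrt(sxx * syy)
    return beta, eta, r


# Seven illustrative times between failures in hours (not from the study).
beta, eta, r = fit_weibull_mrr([410.0, 620.0, 150.0, 980.0, 300.0, 760.0, 530.0])
print(f"beta={beta:.2f}  eta={eta:.0f} h  r={r:.3f}")
```

A fitted shape parameter below 1 implies a decreasing failure rate, 1 a constant rate, and above 1 an increasing rate.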
From a theoretical standpoint, some of the suggested distributions were surprising because they indicated decreasing failure rates for mechanical components, whose failure rates are typically at least constant, if not increasing, because of wear-out.
For example, our model suggested that a transmission chain would get more reliable over time, the rate of breakage decreasing with age. For this reason, we also used the distribution functions recommended by theory with the same field data. The rationale was to test how great the effect of a “wrong” distribution function was.
Indeed, the choice of distribution had a great effect on the simulation results. For example, the same data could lead to increasing or decreasing failure rates depending on whether we used a two-parameter or a three-parameter Weibull distribution. This difference has a great effect on preventive maintenance measures, especially preventive replacement of equipment.
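To see why this matters, it helps to look at the Weibull failure rate directly. The sketch below assumes that “Weibull 2” and “Weibull 3” refer to the two-parameter form (shape beta, scale eta) and the three-parameter form (with an additional location parameter gamma). The parameter values and the function name are made up for illustration and are not the study's fitted values.

```python
def weibull_hazard(t, beta, eta, gamma=0.0):
    """Weibull failure rate h(t) = (beta / eta) * ((t - gamma) / eta) ** (beta - 1).

    gamma = 0 gives the two-parameter form; gamma != 0 shifts the time origin.
    """
    if t <= gamma:
        raise ValueError("t must be greater than the location parameter gamma")
    return (beta / eta) * ((t - gamma) / eta) ** (beta - 1.0)


# Two hypothetical fits, evaluated at increasing ages (hours).
for label, beta, eta, gamma in [
    ("two-parameter, beta < 1", 0.8, 1000.0, 0.0),
    ("three-parameter, beta > 1", 1.5, 900.0, 50.0),
]:
    rates = [weibull_hazard(t, beta, eta, gamma) for t in (200.0, 400.0, 800.0)]
    print(label, [f"{h:.5f}" for h in rates])
```

With a decreasing failure rate, preventive replacement makes failures more likely rather than less, so a fit that lands on the wrong side of beta = 1 can reverse the maintenance recommendation.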