The new parts of the reliability engineer's job: PdM strategy and the IoT

Industry increasingly relies on the internet of things (IoT) to support asset health, and the reliability engineer (RE) can play a vital role here. Leveraging IoT and complementary technologies requires skills and expertise that most organizations don’t have in-house. The RE will likely partner with an IoT service provider to set up, access, and analyze data from sensors and devices and convert it into actionable instructions. The RE, acting as gatekeeper in the feedback loop, uses this intelligence for condition monitoring, risk management, and reliability improvement initiatives.

Predictive technology sensors are strategically located on critical machines and communicate via the cloud. The cloud provides the infrastructure for streaming, analyzing, and storing data for more thorough advanced analytics later. These systems can gather information and statistics from the data to be used for process and reliability optimization. This feedback goes through the RE for validation and follow-up action. In effect, the arrangement allows online condition monitoring of plant equipment, so failures can be predicted and the necessary repairs planned and executed before functional failure.


Because one of the RE’s responsibilities is to manage the predictive maintenance (PdM) strategy, he or she must possess or develop the skills necessary to manage an online condition monitoring program. The aim is to predict when equipment failure may occur and to prevent it by performing planned corrective maintenance. Predicting failure has typically involved tools and processes such as vibration monitoring, thermography, ultrasonics, tribology, and motor analysis. Data collection has historically been route-based and conducted with handheld devices.

With the advent of the IoT, online condition monitoring with advanced analytics is now available on a large scale. Equipment condition is continuously monitored and readings are compared with defined parameters. This enables the tracking of patterns, or combinations of patterns, that might indicate equipment failure. To manage the PdM strategy, the RE must have a fundamental understanding of predictive technologies. Manual data collection with handheld devices will be necessary to validate cloud-based failure predictions.
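As a rough illustration of comparing readings with defined parameters, the sketch below classifies a vibration reading against alert and alarm limits and flags a sustained exceedance for manual validation. The `Limits` class, the function names, and the limit values are hypothetical and chosen only for illustration; real limits would come from the equipment's baseline and the applicable standards.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical alert and alarm limits for one monitored parameter."""
    alert: float
    alarm: float


def classify_reading(value: float, limits: Limits) -> str:
    """Compare a single reading with its defined limits."""
    if value >= limits.alarm:
        return "alarm"
    if value >= limits.alert:
        return "alert"
    return "normal"


def flag_for_validation(readings: list[float], limits: Limits,
                        consecutive: int = 3) -> bool:
    """True when the last `consecutive` readings all reach the alert limit.

    Requiring several readings in a row is an assumed way to avoid
    dispatching a technician for a single noisy sample.
    """
    if len(readings) < consecutive:
        return False
    return all(r >= limits.alert for r in readings[-consecutive:])


# Illustrative values only (vibration velocity, mm/s RMS).
vibration_limits = Limits(alert=4.5, alarm=7.1)
recent = [2.8, 3.1, 4.6, 4.9, 5.2]

print(classify_reading(recent[-1], vibration_limits))  # alert
print(flag_for_validation(recent, vibration_limits))   # True
```

A flagged asset would then be a candidate for a handheld reading to confirm or reject the cloud-based prediction.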

Reliability engineers need to develop a working familiarity with the following areas of technology to remain effective in their role.

(1) Predictive technology sensors. The RE will need to understand the fundamentals, application, and maintenance of wireless sensors. In addition to process sensors (that detect, for example, pressure, water quality, gas, smoke, level, motion, and humidity), the following predictive technology sensors are widely used in online condition monitoring:

Vibration sensors to monitor the vibration of equipment
Temperature sensors to monitor temperature variation
Infrared sensors to measure the heat an object is emitting
Oil level sensors to measure variation in oil levels
Acoustic sensors to detect changes in ultrasonic sound made by the equipment
Motor voltage and current sensors to monitor for corona, arcing, tracking and imbalance

(2) Real-time condition monitoring. The IoT platform can process real-time streaming data as fast as it is collected, allowing for quick response to changing conditions. IoT software captures and aggregates huge amounts of data from connected machines and immediately analyzes them using predictive modeling to deliver intelligence for corrective action. The RE will likely partner with peers in information technology, process engineering, and cloud services to maximize system capabilities.

Online condition monitoring also helps the RE determine when assets are nearing end of life. This lets the operations team plan for these assets’ replacement and disposal.

(3) Big data analytics. Big data analytics generally involves the use of cloud-based software to monitor and analyze signals from typically thousands of wireless sensors strategically placed on critical assets. It then triggers the necessary maintenance or operations actions based on rules, conditions, algorithms and models defined by the RE and process engineering.
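The idea of triggering actions from rules defined by the RE and process engineering can be sketched as a small rule table evaluated against each incoming set of readings. Everything here (the field names, the thresholds, and the action strings) is a hypothetical example, not the interface of any particular IoT platform.

```python
from typing import Callable

# One snapshot of readings for an asset, keyed by a hypothetical signal name.
Reading = dict[str, float]
# (rule name, condition on the reading, action to trigger)
Rule = tuple[str, Callable[[Reading], bool], str]

# Illustrative rules; real conditions and thresholds would be set by the RE
# and process engineering for each asset class.
RULES: list[Rule] = [
    ("high vibration",
     lambda r: r["vibration_mm_s"] > 7.0,
     "create urgent work order"),
    ("vibration with heat",
     lambda r: r["vibration_mm_s"] > 4.5 and r["bearing_temp_c"] > 80.0,
     "schedule bearing inspection"),
    ("low oil",
     lambda r: r["oil_level_pct"] < 30.0,
     "notify operations to top up oil"),
]


def evaluate(reading: Reading) -> list[str]:
    """Return the actions of every rule whose condition holds."""
    return [action for _name, condition, action in RULES if condition(reading)]


sample = {"vibration_mm_s": 5.2, "bearing_temp_c": 81.0, "oil_level_pct": 62.0}
print(evaluate(sample))  # ['schedule bearing inspection']
```

Keeping the rules in a plain table like this makes them easy to review and revise as the RE learns which combinations of signals actually precede failure.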

The RE should become familiar with:

Streaming analytics, used to analyze huge dynamic data sets. Real-time data streams are analyzed to detect situations that require urgent and immediate action.
Spatial analytics, used to analyze geographic patterns to determine the spatial relationship among objects.
Time-series analytics, used to analyze time-based data to identify trends and patterns.
Prescriptive analysis, which uses descriptive and predictive analysis to identify the best course of action to take in a particular situation in light of given parameters and priorities.
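To make the first and third items more concrete, here is a minimal sketch of a rolling window over one sensor stream. It flags a reading that departs sharply from recent values (a simple stand-in for streaming analytics) and reports the slope of the window (a simple stand-in for time-series trending). The `RollingMonitor` class, the window size, and the z-score limit are assumptions made for this example; production platforms use far more elaborate methods.

```python
from collections import deque
from statistics import mean, pstdev


class RollingMonitor:
    """Keep a fixed-size window of recent readings from one sensor stream."""

    def __init__(self, window: int = 5, z_limit: float = 3.0) -> None:
        self.values = deque(maxlen=window)
        self.z_limit = z_limit

    def update(self, value: float) -> bool:
        """Add a reading; return True if it deviates sharply from the window."""
        is_outlier = False
        if len(self.values) == self.values.maxlen:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            # Skip the check when the window is perfectly flat (sigma == 0).
            is_outlier = sigma > 0 and abs(value - mu) / sigma > self.z_limit
        self.values.append(value)
        return is_outlier

    def trend(self) -> float:
        """Least-squares slope of the window, in units per sample."""
        n = len(self.values)
        if n < 2:
            return 0.0
        x_bar = (n - 1) / 2
        y_bar = mean(self.values)
        num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(self.values))
        den = sum((i - x_bar) ** 2 for i in range(n))
        return num / den


# Illustrative bearing temperatures (deg C); the last value is a sudden jump.
monitor = RollingMonitor(window=5, z_limit=3.0)
stream = [70.1, 70.3, 70.2, 70.4, 70.3, 70.2, 78.9]
flags = [monitor.update(v) for v in stream]
print(flags)                 # only the final reading is flagged
print(monitor.trend() > 0)   # True: the window is now trending upward
```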

Proactive root-cause analysis (RCA) methods have not changed, but the huge amount of cloud data now available gives the RE the ability to validate root causes statistically. There is always risk in applying solutions to probable root causes that have not been validated. Big data analytics complements the RCA process and helps the RE manage that risk.

(4) Building failure models and machine learning. The RE will have the opportunity to build failure models to generate P-F curves for planning corrective action. This requires knowing the reasons for failure or failure mechanisms, identifying the combination of key parameter values that indicate failure, and using statistical data analytics and mathematics to build the model. These failure models are then used to estimate P-F intervals: the time between the point at which a potential failure (P) becomes detectable and the point of functional failure (F).
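A very simple failure model can be sketched by fitting a straight line to a degrading parameter and projecting when it will reach the level associated with functional failure. The linear trend, the function names, and the numbers below are assumptions for illustration only; real degradation is often nonlinear, and a usable model would be built from the asset's actual failure mechanisms and history.

```python
from typing import Optional


def fit_line(times: list[float], values: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    slope = num / den
    return slope, v_bar - slope * t_bar


def remaining_time(times: list[float], values: list[float],
                   failure_level: float) -> Optional[float]:
    """Projected time from the last reading until `failure_level` is reached.

    Returns None when the trend is flat or improving, because a linear
    projection then never reaches the failure level.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    t_fail = (failure_level - intercept) / slope
    return max(0.0, t_fail - times[-1])


# Illustrative data: vibration (mm/s) measured every 10 days after the
# potential failure was first detected; 7.0 mm/s is an assumed failure level.
days = [0.0, 10.0, 20.0, 30.0, 40.0]
vibration = [3.0, 3.5, 4.0, 4.5, 5.0]

remaining = remaining_time(days, vibration, failure_level=7.0)
print(f"about {remaining:.0f} days left to plan and execute the repair")
```

With this made-up data the trend rises 0.05 mm/s per day, so the projection gives roughly 40 days of planning window after the last reading.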

Machine learning is used in those instances where you can’t define a failure model for your equipment using advanced data analytics. Machine learning is integral to the way data is processed, allowing algorithms to find impending failures. The RE should first build competency with data mining and modeling techniques before attempting machine learning.

(5) Digital twin technology. Digital twins are virtual representations of assets and processes that are used to understand, predict, and optimize performance. A digital twin is built with asset data by simulating asset performance in different usage scenarios under varying conditions.

Models based on input factors such as associated risks, operating scenarios, and system configuration can be used to simulate a range of business outcomes such as total expected cost of maintenance and system unavailability over a period of time. The RE will need to develop advanced analytical skills to design and deploy different simulation models.
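One way to picture such a simulation is a small Monte Carlo model that estimates expected maintenance cost and unavailability over a year for different system configurations. The exponential time-to-failure assumption, the parameter names, and all values below are hypothetical; a real digital twin would be calibrated with asset data.

```python
import random
from typing import Optional


def simulate_period(mtbf_h: float, repair_h: float, cost_per_failure: float,
                    horizon_h: float = 8760.0,
                    rng: Optional[random.Random] = None) -> tuple[float, float]:
    """Simulate one period; return (maintenance cost, unavailability fraction).

    Assumes exponentially distributed time between failures and a fixed
    repair duration, which is a deliberate simplification.
    """
    if rng is None:
        rng = random.Random()
    clock = downtime = cost = 0.0
    while True:
        clock += rng.expovariate(1.0 / mtbf_h)    # run until the next failure
        if clock >= horizon_h:
            break
        down = min(repair_h, horizon_h - clock)   # truncate at period end
        downtime += down
        cost += cost_per_failure
        clock += down
    return cost, downtime / horizon_h


def run_scenario(runs: int, seed: int, **params: float) -> tuple[float, float]:
    """Average cost and unavailability over many simulated periods."""
    rng = random.Random(seed)
    results = [simulate_period(rng=rng, **params) for _ in range(runs)]
    mean_cost = sum(c for c, _ in results) / runs
    mean_unavailability = sum(u for _, u in results) / runs
    return mean_cost, mean_unavailability


# Two illustrative configurations of the same asset.
baseline = run_scenario(2000, seed=1, mtbf_h=2000.0, repair_h=24.0,
                        cost_per_failure=15_000.0)
upgraded = run_scenario(2000, seed=1, mtbf_h=4000.0, repair_h=24.0,
                        cost_per_failure=15_000.0)

for label, (cost, unavail) in (("baseline", baseline), ("upgraded", upgraded)):
    print(f"{label}: expected cost {cost:,.0f}, unavailability {unavail:.2%}")
```

Comparing the two outputs shows how a change in one input, here mean time between failures, flows through to the business outcomes named above.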

(6) Prescriptive maintenance. Prescriptive maintenance (RxM) takes advantage of cloud technology to detect asset degradation before functional failure occurs and prescribe corrective options to mitigate the identified problem. Multiple scenarios are run; possible outcomes are weighed; and then a decision is made for system operations and maintenance.
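The weighing of scenarios can be sketched as a comparison of expected cost across candidate responses to a detected degradation. The `Option` class, the probabilities, and the costs are invented for illustration; in practice they would come from failure models and plant economics, and the decision might weigh safety and production impact as well as cost.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate response to a detected degradation (hypothetical)."""
    name: str
    action_cost: float           # cost of carrying out the option
    failure_probability: float   # chance of functional failure before the next outage
    failure_cost: float          # cost of an unplanned failure

    def expected_cost(self) -> float:
        return self.action_cost + self.failure_probability * self.failure_cost


# Illustrative numbers only.
options = [
    Option("run to next outage", 0.0, 0.40, 120_000.0),
    Option("reduce load and monitor", 5_000.0, 0.15, 120_000.0),
    Option("replace bearing now", 18_000.0, 0.02, 120_000.0),
]

for option in options:
    print(f"{option.name}: expected cost {option.expected_cost():,.0f}")

best = min(options, key=Option.expected_cost)
print("prescribed action:", best.name)  # replace bearing now
```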

This approach can significantly improve the maintenance organization’s effectiveness and minimize maintenance spending. The RE should have a fundamental understanding of RxM technologies and apply them where benefits outweigh costs.

(7) Augmented reality. A significant amount of the RE’s time is spent tracking down equipment history, drawings, and other key information while troubleshooting equipment in distress and during root-cause failure analysis. Augmented reality complements online condition monitoring by providing the reliability professional with on-the-spot visualization of maintenance problems in their infancy to aid in problem resolution.

The troubleshooter uses smart visual displays that guide them to the asset in distress and overlay key information relevant to the equipment (e.g., O&M manuals, schematics, maintenance history, cloud data, and advanced analytics) to guide the work. This technology can reduce diagnostic time and improve repair quality, thereby improving plant reliability.

(8) Drones and unmanned vehicle technology. Some failure modes remain unmitigated because of difficulty accessing areas where affected assets reside. Now, some maintenance organizations are using unmanned vehicles to conduct inspections on infrastructure and other facility assets in hard-to-reach areas. REs should consider using drones and reality modeling to enhance productivity and safety in asset inspections.

New technologies are developing rapidly, and many of them have the potential to help manufacturers operate more safely and more profitably. The reliability engineer who can harness new technological approaches effectively will remain an influential and highly valued member of the engineering team.
