Have you considered a career in Reliability Engineering?

June 9, 2020
Here is your guide to the several branches of Reliability Engineering, and the certifications to match.

Are you an engineering student looking for a specialized field? An entry-level engineer looking for your first job? Or maybe a practicing engineer looking for a change? Do you possess both an aptitude and passion for figuring out how things work and improving their performance? If you answer yes to any of these questions, consider a career in the dynamic field of Reliability Engineering.

Personally, I transitioned from a Maintenance Engineer role to a Manufacturing Plant Reliability Engineer. I’ve been fortunate to develop competencies and gain valuable experience that wouldn’t have been available elsewhere. The rewards, not only monetary, have been beyond my wildest expectations. Accepting the job of Reliability Engineer (or RE) was one of the best decisions I’ve ever made. Maybe this article will lead you down a similar path.

What is Reliability Engineering?

From a technical perspective, Reliability Engineering involves an iterative process of reliability assessment and improvement. These improvements could range from more robust designs to failure mode mitigation utilizing predictive maintenance technologies.

Specifically, Reliability Engineering applies scientific knowledge to components, products, software, systems, plants, or processes to ensure that they perform their intended function, without failure, for the required time period in a specified environment. Reliability has two dominant dimensions: (1) time, and (2) environment or stress. A part has to perform its desired function throughout its life despite adverse conditions or stresses applied to it. These stresses include temperature, corrosion, vibration, shock, fatigue, and other environmental factors.
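
To make the time dimension concrete, here is a minimal sketch of a reliability calculation. It assumes a constant failure rate (the exponential model), which is a simplification that this article does not itself prescribe. The function name and the 50,000-hour figure are hypothetical, and the environment or stress dimension is not modeled at all.

```python
import math


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Probability of running `hours` without failure.

    Assumes a constant failure rate (exponential model): R(t) = exp(-lambda * t).
    """
    if failure_rate_per_hour < 0 or hours < 0:
        raise ValueError("failure rate and hours must be non-negative")
    return math.exp(-failure_rate_per_hour * hours)


# Hypothetical part: mean time between failures of 50,000 hours,
# required to run continuously for one year (8,760 hours).
mtbf_hours = 50_000
one_year = 8_760
print(f"R(one year) = {reliability(1 / mtbf_hours, one_year):.3f}")
```

Real components often show wear-out or early-life failures, in which case a different life distribution would be a better fit than this constant-rate assumption.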

From a non-technical perspective, Reliability Engineering requires creativity, dedication, and a high level of perseverance. The most successful REs are very diligent and work well with other functions. REs need to be able to solve problems effectively, learn along the way, and make decisions quickly.

Essential soft skills for an RE include problem-solving, teamwork, composure under pressure, written and verbal communication, and relationship building. Being able to solve problems effectively requires the ability to work well with others. REs shouldn’t be expected to know all the answers, but they need to be able to draw upon expertise within the organization.

Reliability Engineering can be divided into three major branches:

  1. Manufacturing Plant Reliability Engineering
  2. Reliability Design Engineering
  3. Site Reliability Engineering

Manufacturing Plant Reliability Engineer

The primary role of the Manufacturing Plant RE is to identify and manage asset and system reliability risks that could negatively impact business operations. The RE works to achieve maximum asset and system uptime by tracking and finding ways to minimize production losses and unusually high costs to maintain equipment and systems.

This RE manages risks to attain strategic objectives using tools such as criticality analysis, failure mode and effects analysis (FMEA), root cause analysis (RCA), and critical spares analysis. REs manage the Predictive Maintenance Strategy (PdMS) and the online condition monitoring programs for their area of responsibility. In addition, the RE provides engineering support in the design and installation stages of new assets; the modification of existing assets; and technical support to facilities management, technicians, and production personnel to mitigate operational issues.
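
As an illustration of how an FMEA can feed risk ranking, the sketch below scores failure modes with a risk priority number (RPN), a common convention that multiplies severity, occurrence, and detection ratings. The 1-to-10 scales, the class name, and the example failure modes are all assumptions made for illustration; your organization's FMEA procedure may score risk differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (easily detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Risk priority number: product of the three ratings.
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for a process pump.
modes = [
    FailureMode("Pump bearing seizure", severity=8, occurrence=4, detection=3),
    FailureMode("Mechanical seal leak", severity=6, occurrence=6, detection=5),
    FailureMode("Coupling misalignment", severity=5, occurrence=3, detection=2),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:>4}  {mode.description}")
```

The ranked list is only a starting point for discussion; a high-severity mode with a modest RPN may still deserve attention first.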

The RE typically utilizes a computerized maintenance management system (CMMS) and/or enterprise asset management system (EAM) for managing assets, creating asset maintenance plans, recommending spare parts levels, reviewing and assessing maintenance plans, and generating reports for assigned metrics and business decisions.

For smaller plants or facilities, this RE will likely be responsible for the reliability of all assets. For large plants, however, the Manufacturing Plant RE role is usually broken down into three areas of specialization: fixed, rotating, and electrical equipment. These jobs typically require a four-year degree in engineering.

  • Fixed Equipment RE. Fixed equipment is static equipment typically used in the process industries. Some examples include pressure vessels, heat exchangers, piping, storage tanks, valves, pressure-relieving devices, boilers, furnaces, heaters, and structures. Fixed equipment REs address the issues that impact the mechanical availability of fixed equipment, providing solutions that maximize its mechanical integrity and reliability. Typical areas of responsibility include risk-based inspection (RBI), mechanical integrity programs, inspection program development, facility and equipment condition assessment, expert application of applicable codes and standards, extracting and verifying technical data, and generating risk profiles.
  • Rotating Equipment RE. Rotating equipment is dynamic equipment used in the process industries to move liquids, gases, and other process materials. Some examples include blowers, compressors, engines, gearboxes, material handling equipment, pumps, and turbines. Rotating equipment REs provide technical guidance on rotating equipment and improve the safety, environmental standards, overall reliability, and operating cost of the plant. They are accountable for increasing equipment reliability by extending the time between failures of rotating equipment while reducing equipment downtime and manufacturing costs. The RE develops strategies to manage assets at peak performance, optimize lifetime return on investment, mitigate reliability risk, and support capital improvements.
  • Electrical Equipment RE. Electrical equipment includes anything used to conduct, control, convert, distribute, generate, measure, provide, rectify, store, transform, or transmit electrical energy. Some examples include bus ducts, circuit breakers, transformers, motor controls, disconnects, panelboards, variable frequency drives, programmable logic controllers, digital control systems, motors, and cables. Electrical equipment REs are responsible for implementing and guiding efforts across the organization to ensure reliability and maintainability of electrical equipment and processes that could adversely affect plant operations. The RE develops and institutionalizes reliability program strategies and elements, as well as champions the transfer and implementation of best practices. They collaborate with other teams to maintain and improve the overall plant electrical system, providing support on high-priority electrical issues.
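
Time between failures and downtime, mentioned above, are often tracked as mean time between failures (MTBF) and mean time to repair (MTTR). The sketch below shows one common way to compute them and the resulting inherent availability. The function names and the pump figures are hypothetical, and the formulas assume repair time is the only source of downtime.

```python
def mean_time_between_failures(operating_hours: float, failures: int) -> float:
    """MTBF = total operating hours / number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return operating_hours / failures


def mean_time_to_repair(repair_hours: float, failures: int) -> float:
    """MTTR = total repair hours / number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTTR")
    return repair_hours / failures


def inherent_availability(mtbf: float, mttr: float) -> float:
    """Fraction of time the asset is available: MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


# Hypothetical pump history: 8,000 operating hours, 4 failures, 40 repair hours.
mtbf = mean_time_between_failures(8_000, 4)
mttr = mean_time_to_repair(40, 4)
print(f"MTBF = {mtbf:.0f} h, MTTR = {mttr:.0f} h")
print(f"Availability = {inherent_availability(mtbf, mttr):.2%}")
```

In practice these figures would come from work order history in the CMMS or EAM system rather than being typed in by hand.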

The majority of fixed and rotating equipment REs possess a mechanical engineering degree. Electrical equipment REs almost always possess an electrical engineering degree. Applicable certifications for Manufacturing Plant REs include the Reliability Engineering Certification and the Certified Maintenance and Reliability Professional.

Reliability Design Engineer

The Reliability Design Engineer (RDE) evaluates and qualifies new product designs for reliability. They plan and implement accelerated life tests; write test reports; lead design failure mode and effects analyses (DFMEAs); and perform reliability budgeting, estimating, and reliability risk mitigation.

The RDE follows the reliability lifecycle of products from concept through design, development, manufacturing, field operation, and field returns, designing in and confirming reliability at every stage. They also generate maintenance task analyses and maintenance plans, and conduct maintainability demonstrations to help translate customer requirements into product specifications.

Typical responsibilities of the RDE include:

  • Facilitating DFMEA sessions in order to drive reliable design choices and improve validation test planning and assemblies;
  • Implementing reliability methods to build in reliability and validate that targets are met;
  • Analyzing usage and environmental conditions from the field in order to improve requirement setting and testing methods;
  • Creating fault trees and reliability block diagrams to assess system reliability;
  • Devising accelerated test methods to explore failure modes and boost reliability;
  • Specifying reliability validation plans for components and subsystems;
  • Facilitating failure analysis to understand root causes and drive resolution for failures from testing and the field;
  • Developing key performance indicators; and,
  • Managing qualification testing.
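
The reliability block diagrams in the list above can be evaluated with two simple rules, sketched below: blocks in series must all work, while blocks in parallel fail only if every one of them fails. This is a minimal sketch that assumes the blocks fail independently; the function names and the component reliabilities are hypothetical.

```python
from math import prod


def series(*reliabilities: float) -> float:
    """All blocks must work: multiply the individual reliabilities."""
    return prod(reliabilities)


def parallel(*reliabilities: float) -> float:
    """Redundant blocks: the group fails only if every block fails."""
    return 1 - prod(1 - r for r in reliabilities)


# Hypothetical system: two redundant pumps (0.90 each) feeding
# a single controller (0.99) that has no backup.
pumps = parallel(0.90, 0.90)
system = series(pumps, 0.99)
print(f"Pump pair: {pumps:.4f}")
print(f"System:    {system:.4f}")
```

The example shows why redundancy is attractive: two pumps at 0.90 each behave like a single block at about 0.99, as long as the independence assumption holds.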

Reliability Design Engineer jobs typically require an advanced degree in engineering or physics. Applicable certifications include the Reliability Engineering Certification and the Certified Reliability Engineer.

Site Reliability Engineer

The Site Reliability Engineer (SRE) applies software development skills and mindset to information technology (IT) operations, with the goal of improving the reliability of large systems through automation and continuous integration and delivery.

SREs promote system reliability and efficiency throughout the software development lifecycle. They ensure the reliability and availability of cloud-based platform services, query execution, data processing, and more. SREs work closely with product developers to ensure that the designed solution responds to availability, performance, security, and maintainability requirements.

Typical responsibilities of the SRE include:

  • Ensuring service uptime and performance;
  • Designing and implementing automation, observability, and growth;
  • Configuring new service infrastructure and upgrading existing services;
  • Researching new technology and running experiments to overcome issues and plan for future demand;
  • Diagnosing problems across the network, server, operating system, and application;
  • Utilizing configuration management tools to deploy and maintain services;
  • Providing technical support to the platform service users;
  • Creating and refining metrics, monitoring, and alerting mechanisms to provide visibility into production services;
  • Writing software as necessary to ensure system reliability;
  • Managing the lifecycle of work orders assigned to the platform; and,
  • Performing software reliability and validation testing.
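
Uptime targets like those in the list above are often translated into an allowed amount of downtime, sometimes called an error budget. The sketch below shows that arithmetic. The 99.9% target, the 30-day window, and the function names are assumptions chosen for illustration, not figures from any particular platform.

```python
def allowed_downtime_minutes(availability_target: float, period_days: int = 30) -> float:
    """Downtime permitted in the period while still meeting the target."""
    if not 0 < availability_target <= 1:
        raise ValueError("availability_target must be in the range (0, 1]")
    period_minutes = period_days * 24 * 60
    return (1 - availability_target) * period_minutes


def remaining_budget_minutes(
    availability_target: float, downtime_so_far: float, period_days: int = 30
) -> float:
    """Budget left after subtracting downtime already incurred (may be negative)."""
    return allowed_downtime_minutes(availability_target, period_days) - downtime_so_far


# Hypothetical service with a 99.9% availability target over 30 days
# and 12 minutes of downtime recorded so far.
budget = allowed_downtime_minutes(0.999)
left = remaining_budget_minutes(0.999, downtime_so_far=12)
print(f"Budget: {budget:.1f} min, remaining: {left:.1f} min")
```

A negative remaining budget signals that the target has already been missed for the period.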

Minimum qualifications for SRE positions include a bachelor of science degree in computer science or a related discipline, or equivalent experience. Applicable certifications include the Certified Site Reliability Engineer.

Reliability Engineering Career Opportunities

Engineering students interested in the RE field should look for co-op or intern opportunities through their respective schools. New graduates can find jobs through their college or university, as well as through job sites like Monster or Indeed.

I encourage practicing engineers to look within their own organization for open RE positions. If none are available internally, you’ll be amazed at the number of RE postings online. I wish you the best and good hunting!

About the Author: Michael Blanchard

Michael Blanchard, PE, CRE, is a reliability engineering subject-matter expert with Life Cycle Engineering. Blanchard is a certified Lean-Six Sigma Master Black Belt and is dedicated to helping teams achieve their goals and sustain gains.
