Podcast: How PM optimization improves reliability and reduces unplanned downtime
Key takeaways
- PM optimization starts with clear goals: inspections should yield identical results from both seasoned and new technicians.
- Many PMs lack value—use metrics like PM/PdM yield to identify and improve or eliminate ineffective tasks.
- Aligning maintenance with planning and scheduling reduces downtime and prevents reactive repair chaos.
- Trimming bloated PMs and focusing on useful inspections can boost uptime, cut stress, and streamline production.
In this episode of Great Question: A Manufacturing Podcast, Thomas Wilk, chief editor of Plant Services, is joined by Brian Hronchek, principal trainer and consultant with Eruditio, to discuss a topic in the air these days: PM optimization. These efforts start with the question, “What is the outcome that you want?” And that outcome is for the seasoned technician and the new technician to be able to return exactly the same results when they do the same inspection on a mechanical asset.
Below is an edited excerpt from the podcast:
PS: Hi, everyone, and welcome to a new episode of Great Question: A Manufacturing Podcast. I'm Tom Wilk, the editor in chief of Plant Services magazine, and today I am coming at you live from the Leading Reliability 2025 conference in Clearwater Beach, FL.
I'm with Brian Hronchek, who is principal trainer and consultant with Eruditio, although he also says just call him “Coach.” So Brian, welcome back to the podcast, Coach!
BH: Thank you, Tom, thanks for having me.
PS: Thanks for having me back at the convention too. This is such an amazing setting. And despite the beach and the sand, I know we're having some amazing conversations about reliability.
BH: Yeah, this is one of the best conferences to come to just because it's all about learning. It's not about sales, it's not a bunch of booths where you get pressured to buy something. It's just hearing from other people like you that have been through challenges, and telling you that you can get it done. Sometimes that's half the battle – believing it can be done.
PS: Last year, you helped us understand how to do that when it comes to root cause analysis and FMEAs. Today, though, we're going to talk about the topic of the workshop that you gave here at Leading Reliability, which is PM Optimization. Before we started recording, you and I both commented that, especially in 2025 for some reason, PM optimization is suddenly in the air in conversations, and you’re hearing a lot of that, right?
BH: Definitely. We've done so many workshops for either conferences or individual companies, and we see that light come on with the realization that we've either been doing it wrong or we haven't really understood how to do it. And now that they have that tool, it has been fun to see that change in the customers.
PS: Given that PM optimization can cover a huge variety of tasks and improvements, what are some of the specific tactics or PMs that you see people asking you for advice on? Does it come down to a kind of technology or is it a kind of process we’re looking at?
BH: Yeah, it's really understanding and having the tools in your pocket to be able to answer the question of, “how do I optimize this PM?” It starts with, “what's the outcome that you want?” And I don't mean the technical outcome, I mean really what is the outcome that you want.
The outcome we want is that the seasoned technician and the new technician, the guy that was delivering Domino's on Friday and now he's a qualified mechanic on Monday, right, you want them to be able to return exactly the same results when they do the same inspection. And if you look at most of our PMs, ask yourself, or even run a test: hand the PM inspection to two different technicians, an old one and a brand new one, and have them go out and do the same inspection. Can they return the same results? Most of the time the answer is no.
PS: Interesting. And so as an exercise, I imagine once you do get people on the same level for these PMs, it does build a certain change in the culture where suddenly people have confidence that the process will lead to the identical outcome every time, right?
BH: Yeah, and so that's what we want on the management side of it, or on the reliability engineering side, is we want the same results. But what the technicians really want and really crave – and I'll say that with air quotes that they “crave”, right – is just being supported in doing their maintenance jobs.
So the PM inspection itself is not for the inspection. It's so that you have enough time to plan, schedule, kit, do those things for the job so that the repair can be done in a completely non-reactive planned and scheduled type manner, rather than sending a guy out into the unknown to find parts, find tools, figure things out, and get injured.
PS: You bring up the intergenerational aspect, or at least the veteran person on that workforce team versus the newer person on the team, that doesn't have to be generational, just has to be older employee versus a fresher employee on the job. Do you find that there are certain aspects of the job that newer employees have trouble with?
I'll give a specific example that I heard one time: someone new to a plant wanted to show what a good worker, what a strong worker they were, so they were over-torquing bolts that were attaching motors to the floor, and eventually that was resulting in the bolt head being sheared off. In fact, the person for the right reasons, wanting to impress people and show them they were diligent, was doing the wrong thing, which was just completely over-tightening this bolt. Do you see things like that, that are endemic to fresher workers trying to make an impression, or the older workers who thought they knew what the process was?
BH: Yeah, and we have to make a differentiation because we're standing at a fork in the road in this conversation right now, and down one path is PMs and down the other path is repairs. Something that's good to level up on before we go any further is “what's our definition of PM?”
When we look at PMs, and I can say this from our customers' perspective, if we ask them what's a PM, they go, “oh yeah, that's find and fix, we go and look at it until we find something and then we fix it right away.” That's not my definition of a PM. Or for some others, it's “periodic maintenance.” Well, that's still not my definition of a PM.
When I talk about preventative maintenance, we're really focused on the inspections to find a problem so that it can be planned and scheduled. There are time-based PMs where you're actually replacing a filter, changing oil, doing something like that on a time-based or on a meter-based (schedule). But for the most part, we're really talking about, “go look at it and if it's good, check it off, and if it's bad, tell me what it is so that we can get somebody to plan it, so we can go back and fix it.”
In general, the torquing and the tensioning and those types of things are associated with precision maintenance and repairs rather than with inspections. But I completely agree with you, you want to go out there and be the guy and get the kudos and earn your place on the team, and so you go overdo it and then of course you induce a failure, and we don't want that either.
PS: Right. Yeah, I forget what the stat is, but I think it's simply a majority of machine failures are introduced by people touching them in incorrect ways.
BH: Yeah, only about 20% of all of the things that fail happen because they actually wore out or because of the system; 30% are because maintenance touched it, and 50% is because operations touched it. I don't mean that in a way like anybody was malicious or anybody was untrained, because sometimes it's just the recipe, sometimes it's the operations recipe and how we run the machines and there's a resonance, or some setting is incorrect and we just don't know it. Potentially it's because the PM we give the technician is too frequent and he's touching it too often, even though he's doing the right things. Not to put any onus on the individual doing something wrong.
PS: I think it's an important thing because so often that can turn into a blame game, and part of what you're saying about optimizing your PMs is that you do need to know these data, but you've got to come at them also from the sense of everyone's doing their best. You don't want to point fingers just because, say, operations was 50% of the failures; that happens to be a team where we can say, “OK, what is it about the process in general that we might look at and do differently?”
BH: Yeah, it's a completely different topic, but precision maintenance is one of the ways we can reduce the number of failures that are induced by maintenance. Of course, PM optimization is another one. But then on the operations side, a lot of us don't talk about it in the maintenance and reliability world, but precision operations – having standard work, having Visual Factory, having 5S and TPM, having all these things in place on the operations side – also reduces the likelihood that operations is going to induce one of those failures. But that's another topic for another day.
PS: Stay tuned, I'm sure we'll do a fourth or fifth podcast in our series together. Back to PM optimization then: are there types of PMs that you see teams wanting to tackle first that maybe they shouldn't? Or do you get a sense, when you meet people in the field, that they have a good handle on which PMs need to be looked at first?
BH: I don't know that most of us have a good understanding of it enough, especially on the application side, out in industry. Now our team, we've had a couple of years to look at this problem and say, “you know what, there's not enough out here,” so we've written a book on how to do PM optimization. We've built this workshop on it, so we’re starting to really get our heads wrapped around it, but that's not common knowledge in industry yet.
If I look at it from the customer point of view, we know that PM optimization is good, but we don't know how to do it, so we just start throwing things at the wall: “well, let's optimize all of our PMs.” It's like, well, are they bad? Maybe, honestly it’s likely that most PMs are pretty bad, but do you know that they're bad? Why are you choosing those PMs?
What we would want to teach is first of all ways to look at your PMs in general, to be able to determine if there's even a problem, and we can use SMRP metrics – PM/PdM yield is in the fifth pillar, the work management pillar, and that is an excellent tool for figuring out if we should even do PMO, right? But then once you determine, OK, I think this group of PMs is not producing the results we want so we should optimize them, the next question is, “what does good look like?”
PS: Do you find that when people do find the PM which needs a second look, that it's more the case that it should be eliminated? Or do you find that it's a PM that is still good, but needs to be fixed? Because I've heard a lot of people do want to reduce the number, the volume of work that is not useful any longer.
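As a rough illustration of the screening step described above, here is a minimal Python sketch of a PM/PdM yield calculation. It assumes the metric is expressed as corrective work hours identified by PM/PdM work divided by the PM/PdM hours spent, which is one common way to state it; the exact SMRP definition should be checked before relying on this. The function name, the PM group names, and all of the numbers are hypothetical.

```python
def pm_pdm_yield(corrective_hours_found: float, pm_pdm_hours: float) -> float:
    """Corrective hours identified per hour of PM/PdM work (assumed definition)."""
    if pm_pdm_hours <= 0:
        raise ValueError("pm_pdm_hours must be positive")
    return corrective_hours_found / pm_pdm_hours


# Hypothetical monthly totals per PM group:
# (corrective hours identified, PM/PdM hours spent)
groups = {
    "crane_route": (2.0, 40.0),
    "conveyor_route": (30.0, 60.0),
}

yields = {name: pm_pdm_yield(found, spent) for name, (found, spent) in groups.items()}

# List the lowest-yield groups first: these are candidates for a closer look.
for name, value in sorted(yields.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}")
```

A low yield on its own does not prove a PM is bad (the asset may simply be healthy), so a ranking like this is only a prompt for the "do you know that they're bad?" question.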
BH: Yeah, that's an incredibly important point, is reducing the amount of work. You know those apps out there that say, hey, we can take a look at your finances and tell you where you're wasting money.
PS: Oh my gosh, so many commercials say “download this app, you'll take a look at your subscriptions, blah blah blah.”
BH: It's completely funny because about 10-15 years ago I used to bounce a lot of checks. I had an app on my phone that said, hey, we'll connect all your bank accounts and everything, and tell you exactly how much money you have. That's great because I would look at the app, I would refresh it and it says, hey, I've got $100 in the bank, I'm going to buy a hamburger. What it wouldn't tell me is that tomorrow I had my utility bill scheduled, and all of a sudden that utility bill bounced, and then the water bill bounced, and the electric bounced, and then my rent bounced and, you know, whatever it is.
I was disconnected from my finances because I was letting something or someone else manage it for me, right? All my money is flying out the window and I'm still making decisions to send more of it out the window, and yet I'm not getting any better even though I have a tool that tells me where all my money is going. It's like OK, this is a bit of a reactive tool because I'm not doing my job. So I changed that, I put everything in a spreadsheet and I started having daily touches and I knew where everything was going, and quit bouncing checks, so I'm going to pat myself on the back. Good job, buddy!
The same thing is happening if we think about our maintenance resources – our labor, our time, right? A lot of our labor is going towards things that completely cannot produce value. It's like the Netflix and the Hulu and the Xbox 360 subscriptions, and our kids don't live at home anymore and they don't use them, but we're just continuing to send that money out the door, and then you finally realize, oh my God, I can cancel that subscription, and that puts money in my pocket. That didn't change any other problems, but it puts more money in your pocket.
So the first step in PMO is doing that scrub and determining against a set of criteria, “is this activity, is this inspection valuable or is it not?” Could even the seasoned technician give us any results out of this? We have a customer who's actually downstairs right now, down at the conference who actually had an inspection that had 1,800 inspection points. They would give themselves 8 hours to do 1,800 inspection points. But their version of PM was find & fix, not inspect and then plan & correct, so they would get about two or three hours into this 8-hour PM and they would find a handful of things that needed to be fixed, and they'd spend the next three days fixing them.
Guess how much credibility they had when they said, “we'll do a PM and we'll give it back to you in eight hours”? Never. They're like, well, we're not going to do this very often, so how about once a year let's do a PM. Well that’s a terrible frequency for something, especially when it fails more often than that. Going through PM optimization, I think they went from 1,800 inspection points down to like 400 inspection points, and those last 400 inspection points were polished up and cleaned up to really be valuable.
Then what they did is they changed their downtime for the PM from 8 hours to 2 hours. They said, “hey, we don't need 8 hours, we only need 2 and we promise you we're not fixing anything, we are just looking.” Operations got on board and were like “well, we can live with 2 hours” and they tried it a few times. What they would do is like a pit stop: everybody dives on it, everybody gets their specific inspections, records the data, dives off, turns the thing back on, and let's spend the next two weeks planning the repairs, and then let's come back for 4 hours and fix it. And they have gotten such incredibly, incredibly huge gains in performance in that area and now starting across the plant as it expands.
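To put rough numbers on that story, the short Python sketch below works out the wall-clock time budget per inspection point before and after the optimization. It uses only the figures quoted in the conversation (1,800 points in 8 hours, roughly 400 points in 2 hours, a 4-hour follow-up repair window); the speaker gives the 400 as approximate. Treating the window as a single serial budget is a simplifying assumption, since the pit-stop approach has several people inspecting at once.

```python
def seconds_per_point(window_hours: float, points: int) -> float:
    """Wall-clock seconds available per inspection point in a downtime window."""
    return window_hours * 3600 / points


before = seconds_per_point(8, 1800)  # original PM: 1,800 points in 8 hours
after = seconds_per_point(2, 400)    # optimized PM: about 400 points in 2 hours

# Planned downtime per cycle after optimization:
# the 2-hour inspection plus the 4-hour planned repair window.
planned_downtime_after_hours = 2 + 4

print(f"before: {before:.0f} s per point")
print(f"after:  {after:.0f} s per point")
print(f"planned downtime per cycle after: {planned_downtime_after_hours} h")
```

The per-point budget barely changes, which suggests the gain described here comes from cutting low-value points and separating inspection from repair rather than from inspecting faster.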
PS: That is tremendous. We're talking about throughput or quality or both?
BH: Everything, absolutely everything. I mean, not only the availability and the uptime, but they actually spend less time down maintaining it. They have fewer days where production is impacted because that 2-hour window doesn't impact them enough to really care, so they're like, we can give it up for 2 hours on any day. “Two hours? OK, fine, but just give it back within 2 hours, we're good.” The stress on the production planners, the stress on the operations managers, all that stuff is just completely changed.
PS: Let me bring something which is subtext up to the main text, which is that when this team put the PM optimization plan into place, they did something really important – they did what they said they were going to do. It was 2 hours. It wasn't more. They resisted the temptation to fix things. And so you've got this new reputation, if you will, or a different kind of attitude coming from that team, and the operators can sense, “OK, yeah, they did what they said they were going to do.” It was an investment in future PMs, because with this one, they absolutely adhered to the terms of the agreement with operations.
BH: Definitely. Building trust gives you the opportunity to expand and do it in other places. Now the organization as a whole has transformed: two years ago they were struggling to keep up with production month after month, and now they've actually hit a point where I think they're like 6 or 8 months above production goals. They've been really, really doing well. It's been a slow journey, but that journey is a steady journey and it doesn't have that volatility, and it just continues to grow a little bit every month.
PS: This reminds me too, of something which Klaus Blache down at the RMC in Knoxville, always says, which is that so many success stories, if not all of them, start with this: “here's how operations and maintenance figured it out together.”
We'll get you out on this question, Brian: when it comes to PM optimization, you know the title of your workshop was PM Optimization Made Simple. Are there any specific insights like the one we just talked about – do what you say you're going to do – which are really key to helping these optimization efforts succeed?
BH: Yeah, there's a few and we should get a little bit technical in the conversation because I don't want to leave everybody with, “oh, that was such a great story but how do you do it?” There's always building blocks. Like we talked about on the FMECA podcast last year, each question is there to answer the next question. Everything is the same; PM optimization is the same.
The first thing we need to understand is, what is a failure mode, right? How do we define it? I can tell you I've been in FMECAs and I've been in different conversations as a reliability engineer, and I didn't have a good grasp of what it is. And if you ask everybody, they might give you different definitions of failure modes and failure mechanisms and part and problem and cause, and how do they all fit together? So the first thing we have to understand is that equipment doesn't fail, right? I ask in the class all the time, “does your car fail?” And I get a lot of nods and I say no, you're all incorrect. The car does not fail. The components in the car fail. The head gasket blows. The bearing seizes. The transmission gear tooth shears.
PS: Right. There's no Blues Brothers moments where the car just explodes and dies. It's always something.
BH: Yeah, the car itself doesn't fail, it's the components that fail. And when we look at the P-to-F curve and when we look at the six failure patterns, we're talking about the component or the part that fails, not the automobile or the equipment that fails. So understand that first and understand that every inspection that we do is based on a component, not on the asset.
Say we have a crane inspection. You have a whole bunch of little inspections that are on bearings and couplings and base bolts and hoist reels and cables. Each component gets inspected to make up the crane inspection, which is a PM plan or a PM route, right? So the first thing we’ve got to differentiate is we're looking at each individual inspection point, each individual inspection, not the whole plan by itself. And then we have to look at what is the failure mode or the failure mechanism, or the problem or the cause that is being addressed by the inspection, right? So if we define a failure mode – and we may have talked about this last year, but we'll hit it again – how would you define a failure mode? What's the definition for that? What I will tell you is I don't care if it's technical or I don't care if it's in layman's terms – the failure mode is the deviant behavior of the component.
First, I'm going to say the bearing is a component. The bearing is supposed to spin. What's the opposite of spinning?
PS: Seizing up.
BH: Seizing, right. So be technical, tell me seizing. Be untechnical, say stop spinning or don't spin no more. I don't care how you put it. That is the failure mode, it is the opposite of what it's supposed to do.
And then we go further down. So if the part is the bearing, the problem or the failure mode is the seizure, what is the cause of the seizure? And now we start opening up to the things that we can look for: over-lubrication, under-lubrication, misalignment, contamination, imbalance. All of these things are conditions or causes that lead us to the bearing seizure, but those are things that I can find.
So now if I've defined my part, my problem, and my cause, I can design something to prevent the cause or to find the cause so I can prevent the seizure or prevent the problem.
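The part-problem-cause breakdown described above can be written down as a small data structure. The Python sketch below is only an illustration: the class name and the wording of the generated sentence are assumptions, while the bearing example and the list of causes come from the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureCause:
    part: str      # the component, not the asset (e.g. the bearing, not the crane)
    problem: str   # the failure mode: the opposite of what the part should do
    cause: str     # a condition that can be found before the failure occurs

    def statement(self) -> str:
        """One sentence tying an inspection to the failure it prevents."""
        return f"This activity prevents {self.part} {self.problem} due to {self.cause}."


# Causes of bearing seizure named in the conversation.
causes = [
    "over-lubrication",
    "under-lubrication",
    "misalignment",
    "contamination",
    "imbalance",
]
bearing_causes = [FailureCause("bearing", "seizure", cause) for cause in causes]

for item in bearing_causes:
    print(item.statement())
```

Each entry is something an inspection could be designed to find, which is the point of defining the cause separately from the failure mode.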
So that's the first question I'm asking: does the task as it's written give me any indication of what failure mode we're trying to prevent? And I don't care if it's “look for play in the shaft.” Well, what's the failure mode associated with play in the shaft, or what's the cause, the deviant condition, the deviant state? And I can look between the lines and say, oh, well, that's looseness in the bearing or that's wear in the shaft. There is something that I'm looking for that is the problem. Now I can answer yes to that question. There is a failure mode being addressed with this task. It's just not very clearly stated.
And since I can answer that question, now I can ask the next question. Should we use predictive technologies, or should we do a quantitative inspection, or should we do an objective inspection, or should this be given to an operator for basic asset care? Starting with the foundations and really understanding the definition for failure modes and for part-problem-cause, if we can understand how to wrap our head around that at the component level, then walking through the logic of doing PM optimization becomes much easier to perform and much easier to facilitate.
And then on the other end we encourage all of our customers to write their PM inspections with four basic elements:
(1) The first element is the part and the problem or the failure mode, so “this activity prevents bearing seizure.”
(2) The second element is the cause: “due to under-lubrication,” OK, “this prevents bearing seizure due to under-lubrication.” That's the first thing I want to put on the PM inspection because I want to empower the technician to be able to act on my behalf. I want him to know that his work today has meaning behind it, right? This is not a check in the box. It's not walk out there and go out and check, check, check and turn in the paperwork, it’s not that. It’s “oh my God, you know what I did today? I made sure that we're not going to have bearing seizures in the plant, and I know what happens when we have bearing seizures, it stops production and everything, so my life is meaningful today.”
(3) The third element is the actual activity and the inspection criteria, the activity and the tolerance. The activity might be “grease a certain amount” or “inspection of a certain tolerance” or something along those lines. I want the measurement or the inspection criteria and then also I want the tolerance – plus or minus so much, or this much grease, or whatever, right.
(4) The fourth element is the conditions and the follow-up actions. “If you find this, then do this.” “If you find this, then I want you to do that, but I also want you to write a work request for a repair.”
If you can put those four elements together, what you end up with is very small variation in what you get out of the result. Because now, and let's go with an objective inspection, right, if I tell somebody “go inspect the hoses,” that's very subjective. But if I say “inspect the hoses for any presence of bulging or cracks, if it's tangible to the touch or if it's visible with the eye, I want you to report it,” that's objective. Now when I get down to the conditions, you know, “Condition #1, no visible bulges or cracks: record as-found / as-left.” Here's Condition #2, “some cracks or bulges were identified: record the as-found, write a follow-up for repair or for replacement.” Now when I give it to the seasoned guy who's been around and he's been seeing leaks all of his life and he's like, “oh, it ain't that bad, the cracks and bulges, it's not spraying anything so it's just fine.” He's going to ignore it, except that I said, “any presence, write up a work request for repair,” and now what he provides and what the new guy provides, because I told him “tangible to the touch or visible to the eye, write a work request for repair,” they're going to return me exactly the same thing. And that same thing gives me time to plan and schedule that repair and prioritize it on our schedule, instead of the machine setting the schedule and scheduling the work for us.
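The four elements and the hose example above can be sketched as a simple record that renders an inspection point the same way every time. This Python sketch is illustrative only: the class and field names are assumptions, and the part/problem and cause given for the hose ("hose leakage", "cracked or bulging hose wall") are placeholders, since the conversation states only the activity and the two conditions for that example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Condition:
    description: str    # what the technician observes
    actions: List[str]  # what to do when this condition is found


@dataclass(frozen=True)
class InspectionPoint:
    part_and_problem: str         # element 1: part and failure mode
    cause: str                    # element 2: cause being looked for
    activity: str                 # element 3: activity and tolerance/criteria
    conditions: List[Condition]   # element 4: conditions and follow-up actions

    def render(self) -> str:
        """Format the inspection point as plain text for a PM document."""
        lines = [
            f"Purpose: this activity prevents {self.part_and_problem} due to {self.cause}.",
            f"Activity: {self.activity}",
        ]
        for number, condition in enumerate(self.conditions, start=1):
            actions = "; ".join(condition.actions)
            lines.append(f"Condition #{number}: {condition.description} -> {actions}")
        return "\n".join(lines)


hose = InspectionPoint(
    part_and_problem="hose leakage",       # assumed for illustration
    cause="cracked or bulging hose wall",  # assumed for illustration
    activity=(
        "Inspect the hoses for any presence of bulging or cracks, "
        "tangible to the touch or visible to the eye."
    ),
    conditions=[
        Condition("No visible bulges or cracks", ["Record as-found / as-left"]),
        Condition(
            "Some cracks or bulges were identified",
            ["Record the as-found", "Write a follow-up work request for repair or replacement"],
        ),
    ],
)

print(hose.render())
```

Because the follow-up action is attached to the condition rather than left to judgment, the seasoned technician and the new one are steered to the same output for the same finding.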
PS: Right, all of which is adding up to what you led with, which is to make sure that anybody who touches the machine according to this PM gets the same outcome.
BH: Yeah, completely, and that's really the goal. Those four elements, we call that the gold standard – the part & the problem; the cause; the activity and tolerances; and then the follow up criteria, follow up conditions or actions. We would say that every inspection point, if you have those four things in it, then you are gold and you will be able to do some excellent inspections.
PS: For those who want to learn more after hearing our talk, where can we point them to?
BH: Go to eruditio.com. Of course we do the workshop, you can call us and we'll come run a workshop for you, or come to a conference and you can see the workshop sometime. But if you go to eruditio.com and you click on Resources, we have a brand new page called Job Plan Guides.
What we've done is we've created a Microsoft Word template that is auto-formatted and prompts the user for the next piece of information. There is a how-to video that goes with it, so you can download the template, download our worksheet for actually scribbling notes during PM optimization and organizing your thoughts, and then you can watch the video on how to dump all that into the template. That really empowers your planners or your reliability engineers to start pumping out some excellent job plans, not just excellent as far as content, but excellent as far as functionality and beauty in the paperwork, without sending them back to school for how to use Microsoft Office.
PS: Excellent. For anyone listening, you'll find those links already in the show notes in the transcript area. So Brian once again, thanks for the next podcast in our series and I can't wait to talk again.
BH: Yeah, it would be good. Thank you, Tom, and I look forward to coming back.
About the Podcast
Great Question: A Manufacturing Podcast offers news and information for the people who make, store and move things and those who manage and maintain the facilities where that work gets done. Manufacturers from chemical producers to automakers to machine shops can listen for critical insights into the technologies, economic conditions and best practices that can influence how to best run facilities to reach operational excellence.
About the Author

Thomas Wilk
editor in chief
Thomas Wilk joined Plant Services as editor in chief in 2014. Previously, Wilk was content strategist / mobile media manager at Panduit. Prior to Panduit, Tom was lead editor for Battelle Memorial Institute's Environmental Restoration team, and taught business and technical writing at Ohio State University for eight years. Tom holds a BA from the University of Illinois and an MA from Ohio State University.