
In his Failure column for this issue Tim J Carter talks about the perils of ignoring known failure risks; the value of monitoring equipment condition for signs of deterioration; and the role ‘what if’ failure response scenarios can play in minimising resulting damage and maximising failure response rates.


To fail to plan is to plan to fail. How often have we heard this? It’s been around for so long it’s grown a beard. Yet there are many times when failure arrives unannounced and unprepared for. And producing a plan when the crisis is upon you usually means working all hours, day and night, and a few more besides.

To be prepared for failures in an engineering system is to plan for an eventuality that is certain to arise. And it is, perhaps, inevitable that it will happen when you are standing alongside your braai with a cold one in your hand and a few friends around you. It was ever so. Disaster never strikes when you are ready for it, but when you least expect it. Murphy was an optimist. If your name happens to be Murphy, I apologise. I know it wasn’t you. The Murphy I was referring to is fictional and malevolent, but as predictable as Angela Merkel’s hairstyle.

Having ‘all the spare parts you might need’ sitting on a shelf in the stores will help in two ways. First, these will only be needed for planned maintenance – for the systems that never fail unexpectedly (Murphy again!) – which results in the ire of bean-counters, who look at the amount of money tied up in inventory. Second, when something does break unexpectedly, critical parts will always have been overlooked or missing from stores. And the agents won’t have thought of keeping these either.

Don’t get me wrong about bean-counters; I have much respect for accountants. Anyone who can accurately do, and seem to enjoy, what is, to me at least, a tedious and repetitive task deserves respect. My late father was an accountant, and he harboured a dream that I should follow in his footsteps. It is probably just as well for the accountancy profession that I became an engineer.

I also have great respect for the hard-pressed maintenance engineer, who is compelled to stretch his manpower, his equipment and his budget a little further every day. These champions stretch the maintenance capability to keep the place running, while having to work with a bare minimum of resources.

I have respect for the production people too. They are forever being pushed to produce more product, in less time, at lower cost and with ever-increasing quality. A difficult situation, to say the least, and it means pushing the equipment to the limits and sometimes beyond.

The maintenance engineer has some powerful tools at his or her disposal. Careful condition monitoring will accurately pin-point an incipient failure in time to take the appropriate steps to at least avert disaster, even when it means implementing a hasty or unplanned shut-down for repairs.

The installation of magnetic chip detectors in a gear-box, for example, will warn that either a gear or bearing is getting past its ‘best before’ date in time for replacement parts to be installed, before the need arises to repair the collateral damage that comes with a serious in-service failure.

Correct oil analysis will do the same. Although dismantling a used oil filter to recover wear debris for analysis after a routine lubricant change is a messy business, it is not as messy as having your week-end ruined if a machine breaks when you are packing up to leave on a Friday afternoon. And wear debris, in the right hands, speaks volumes about the state of the equipment, without having to take it off-line for disassembly.
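For those who like to see the idea in numbers, here is a minimal sketch of trending wear debris between oil changes. It is only an illustration: the function name, the three-sample baseline, the alarm ratio of two and the debris masses are all my own assumptions, not figures from any standard or real gearbox, and a proper oil analysis programme would set its limits from the equipment's own history.

```python
from statistics import mean


def first_alarm_index(readings, baseline_samples=3, alarm_ratio=2.0):
    """Return the index of the first reading that exceeds alarm_ratio times
    the baseline, or None if no reading does.

    The baseline is the mean of the first baseline_samples readings.
    Both parameters are hypothetical defaults chosen for illustration.
    """
    if len(readings) <= baseline_samples:
        return None  # not enough history to judge a trend
    baseline = mean(readings[:baseline_samples])
    # Only readings taken after the baseline period are checked.
    for i, value in enumerate(readings[baseline_samples:], start=baseline_samples):
        if value > alarm_ratio * baseline:
            return i
    return None


# Hypothetical ferrous debris masses (mg) recovered at successive oil changes.
debris_mg = [4.0, 5.0, 6.0, 5.5, 7.0, 12.5, 21.0]
alarm = first_alarm_index(debris_mg)
if alarm is not None:
    print(f"Debris trend alarm at sample {alarm}: {debris_mg[alarm]} mg")
```

A simple ratio against an early baseline is the crudest possible rule; the point is only that a rising trend, logged consistently, gives warning before the machine does.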

When you tell production that they will be down for a few days, they will protest vociferously, complaining that their targets will go straight out of the window. Tell them they could keep running until the whole plant goes completely belly-up, but then it will be a few weeks of down-time instead of a few days, and a lot more expensive.
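The arithmetic behind that argument is easy to put in front of production. The sketch below compares a planned stop with running to failure; every figure in it (days down, lost output per day, repair cost) is a hypothetical placeholder in unspecified currency units, to be replaced with your own plant's numbers.

```python
def downtime_cost(days_down, lost_output_per_day, repair_cost):
    """Total cost of a stoppage: lost production plus the repair bill."""
    return days_down * lost_output_per_day + repair_cost


# Hypothetical planned shut-down: a few days, parts replaced before they fail.
planned = downtime_cost(days_down=3, lost_output_per_day=100_000, repair_cost=50_000)

# Hypothetical run-to-failure: a few weeks, plus collateral damage to repair.
run_to_failure = downtime_cost(days_down=21, lost_output_per_day=100_000, repair_cost=400_000)

print(f"Planned stop:   {planned:,}")
print(f"Run to failure: {run_to_failure:,}")
print(f"Running on costs {run_to_failure / planned:.1f} times as much")
```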

A good organisation will have, in its maintenance files, a selection of ‘what if’ plans that, even if they don’t provide the complete answer, will at least point in the right direction. Check them carefully: at least one will almost certainly be missing.

NASA, usually very good at the technical stuff, was very good at ‘what ifs’, too. But their engineers never spotted that the CO2 absorber canisters for the command module wouldn’t fit the system in the lunar module on Apollo 13. It nearly cost them three astronauts. With Apollo 14, they did know.

They knew about the O-ring seal problem on the space shuttle, but were told to “take off your engineering hats and put on your management hats”, with inevitable consequences. They also knew about the problems with the insulation on the external fuel tank. Both were engineering problems, but the cause of failure was management accepting the status quo and doing nothing about it.

Your systems are probably not linked to such serious consequences, and you probably don’t have a few thousand of the most talented engineers in the US on your staff, either. There’s probably just you and a few others. So I suggest giving the task of developing ‘what if’ plans to the youngest, least experienced engineers-in-training. They may not be overly familiar with the system, which improves the chances of them spotting issues you’ve missed.

Once the issues have been spotted, give them the task of working out how to respond to the identified scenarios. They’ll probably come up with a few bright ideas that either won’t work or would bankrupt the company. But since these are only plans, they can be changed, and the young engineers will be getting good experience of keeping equipment running in the real world.

Some of the ideas they come up with will be spot-on, though, turning out to be diamonds rather than just stones.

The opinions expressed in this column are mine and mine alone.