Here at exida, we have studied hundreds of sets of field failure data from various sources. Some of these data sets show failure rates that differ by more than an order of magnitude for the same product type. After tracing through the data collection process for many of these data sets, it is becoming clear that one significant variable is how each study answers the question “What is a random failure?”
Some data studies record very few failures. In one study, it turned out that many of the “possible failures” had been classified as “systematic” and were therefore not counted in the random failure rate. In a January 2014 presentation, a speaker from a well-known certification body in Germany explained that “most mechanical failures are systematic.” I heard the same comment from a valve manufacturer. The reasoning was illustrated with several examples:
- This valve failed because it was not designed for the application. Therefore it is the customer’s problem, a systematic failure caused by a bad design process at the customer site.
- This actuator failed because of a defect in the manufacturing process, a systematic failure of the manufacturing process design.
- This valve failed because the product designer did not provide enough design strength, a systematic failure of the requirements specification.
This view is especially common among engineers working for manufacturers. I worked in new product development for a manufacturing company for many years. We did not design products to fail at random times. No requirements document that I have ever seen contained a requirement to generate random failures. We designed the products to work to the best of our ability. We therefore often concluded that most failures are systematic in one way or another. However, this view is very unrealistic.
The IEC 61508 standard defines a random failure as “A failure occurring at a random time, which results from one or more degradation mechanisms.” This applies nicely to many of the causes we actually see in the field, including:
- Bad air
- Inadvertent poor selection of materials
- Material defects
- Random bad power events
- Inadvertent environmental stress
- Accidental calibration errors
- Inadvertent maintenance mistakes
- Random failures in a manufacturing process, etc.
All failures due to these realistic causes should be counted when estimating random failure rates.
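To see how much the classification decision alone can move the result, here is a minimal sketch. Every number, category name, and function name in it is hypothetical and chosen only for illustration; none of it comes from an actual exida data set. It uses a simple point estimate (failures divided by cumulative operating hours) and compares a study that counts all realistic causes with one that counts only the failures nobody could explain away as "systematic."

```python
HOURS_PER_YEAR = 8760


def failure_rate_per_hour(failure_count, unit_count, years_in_service):
    """Point estimate: failures divided by cumulative operating hours."""
    operating_hours = unit_count * years_in_service * HOURS_PER_YEAR
    return failure_count / operating_hours


# Hypothetical field study: 1,000 valves in service for 5 years.
UNITS = 1000
YEARS = 5

# Hypothetical failure reports, grouped by the cause written on the report.
recorded_failures = {
    "material defect": 12,
    "valve not suited to application": 30,
    "manufacturing defect": 25,
    "maintenance mistake": 20,
    "unexplained": 8,
}

# Hypothetical narrow view: everything with a named cause is labeled
# "systematic" and dropped, so only unexplained failures are counted.
counted_in_narrow_view = {"unexplained"}

all_count = sum(recorded_failures.values())
narrow_count = sum(
    count
    for cause, count in recorded_failures.items()
    if cause in counted_in_narrow_view
)

rate_all = failure_rate_per_hour(all_count, UNITS, YEARS)
rate_narrow = failure_rate_per_hour(narrow_count, UNITS, YEARS)
ratio = rate_all / rate_narrow

print(f"All realistic causes counted: {rate_all:.2e} failures per hour")
print(f"Only 'unexplained' counted:   {rate_narrow:.2e} failures per hour")
print(f"Ratio between the two:        {ratio:.1f}")
```

With these made-up counts, the same equipment and the same field history produce two failure rates that differ by a factor of about 12, purely because of how the word "random" was interpreted.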
The purpose of a random failure rate is to realistically calculate the probability that a set of safety equipment will fail to perform a safety function. We at exida use a very realistic definition of random failure. However, we do separate product-specific random failures from site-specific random failures. Failure Modes, Effects and Diagnostic Analysis (FMEDA) results include only product-specific failures. In our exSILentia tool, we model site-specific random failures in addition to product-specific random failures, based on site conditions.
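The following sketch shows why adding site-specific failures matters for the final probability. It is not the exSILentia model. It assumes a single device (1oo1), constant failure rates, independent failure mechanisms whose rates can simply be added, a perfect proof test, and the common simplified approximation PFDavg ≈ λ_DU × TI / 2, where λ_DU is the dangerous undetected failure rate and TI is the proof test interval. The rates and the function name are hypothetical.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_1oo1(lambda_du_per_hour, proof_test_interval_hours):
    """Simplified average probability of failure on demand for one device.

    Assumes a constant failure rate, a perfect proof test, and
    lambda * TI much smaller than 1.
    """
    return lambda_du_per_hour * proof_test_interval_hours / 2


# Hypothetical dangerous undetected failure rates, in failures per hour.
product_lambda_du = 5.0e-7  # product-specific, e.g. taken from an FMEDA
site_lambda_du = 3.0e-7     # site-specific, e.g. environment and maintenance

proof_test_interval = 1 * HOURS_PER_YEAR  # proof test once per year

pfd_product_only = pfd_avg_1oo1(product_lambda_du, proof_test_interval)

# Assumption: independent mechanisms, so the rates add.
pfd_with_site = pfd_avg_1oo1(
    product_lambda_du + site_lambda_du, proof_test_interval
)

print(f"PFDavg, product-specific failures only: {pfd_product_only:.2e}")
print(f"PFDavg, product plus site failures:     {pfd_with_site:.2e}")
```

Under these assumptions, leaving out the site-specific contribution understates the probability of failure on demand by the same proportion as the failure rate itself, which is why both kinds of random failure belong in the calculation.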
However, we do not rely on the words in a definition alone, as they can too easily be misinterpreted. We’d like you to see for yourself. Please take a few minutes to complete our “Random vs. Systematic” survey, and check back on exida explains for the results and explanations.