If you were going to build a bridge, you would want to make sure that it did not fall down if there were too many cars on it. One way to accomplish this is to overdesign. If the bridge is expected to hold at most 20,000 pounds, it should be designed to hold 40,000 pounds. That way the bridge will still hold if the estimated capacity is exceeded. The same concept applies to electrical components, and in fact for safety systems, IEC 61508 requires this kind of overdesign. This practice of limiting electrical, thermal, and mechanical stresses on electrical parts to levels below their specified ratings is called de-rating. If a product is expected to operate reliably, a conservative design approach must be one of the contributing factors. Components operating significantly below their rated stress levels will have lower failure rates and potentially longer lives.
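To make the idea concrete, here is a minimal sketch of a per-component de-rating check in the spirit of the bridge example: the applied stress divided by the rated stress must stay below a chosen de-rating factor. The part references, stress values, and the 50% factor are all illustrative assumptions, not values from any particular standard.

```python
# De-rating check: applied stress / rated stress must not exceed
# the chosen de-rating factor. All values below are hypothetical.

DERATING_FACTOR = 0.5  # operate at no more than 50% of the rated value

components = [
    # (reference, stress type, applied value, rated value) -- hypothetical
    ("R12", "power (W)",   0.125, 0.250),
    ("C3",  "voltage (V)", 24.0,  50.0),
    ("D5",  "current (A)", 0.9,   1.0),
]

for ref, stress, applied, rated in components:
    ratio = applied / rated
    verdict = "sufficiently de-rated" if ratio <= DERATING_FACTOR else "needs justification"
    print(f"{ref} {stress}: {ratio:.0%} of rating -> {verdict}")
```

In this sketch, D5 operating at 90% of its current rating would be flagged for justification, which mirrors the component-by-component analysis described below.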
This classic technique has been around for years, but I am often asked what must be done to prove that a product has been de-rated sufficiently. Do you have to go as far as listing every component on the board, showing its typical operating levels vs. its rated levels, and then calculating how much each component has been de-rated? This is exactly the method taught in the 1970s, and it is still used in the nuclear industry. Or is it sufficient simply to show that you have a de-rating process in place?
Realistically, the answer lies between these two extremes. Having a procedure in place is certainly a start, but how do you know that the procedure was followed? One technique used to optimize this process is thermal imaging of a prototype board. A thermal image taken of the board during normal operation will reveal which components are operating hot. Components with negligible self-heating can be considered sufficiently de-rated, while components that are operating hotter should be analyzed further, with their de-rating percentages calculated and documented, as sketched below. Any components shown not to meet the de-rating requirements must be justified.
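Here is a minimal sketch of that triage step, assuming the thermal image has already been reduced to a temperature rise over ambient for each component. The 5 degree threshold and the readings are illustrative assumptions.

```python
# Thermal-image triage: components with negligible self-heating pass;
# hotter parts are flagged for a documented de-rating analysis.
# The threshold and readings are hypothetical.

NEGLIGIBLE_RISE_C = 5.0  # self-heating below this is treated as negligible

thermal_readings = {  # reference -> temperature rise over ambient (deg C)
    "U1": 22.4,
    "R12": 3.1,
    "Q7": 14.8,
    "C3": 0.6,
}

for ref, rise in sorted(thermal_readings.items()):
    if rise > NEGLIGIBLE_RISE_C:
        print(f"{ref}: +{rise:.1f} deg C -> analyze and document de-rating percentage")
    else:
        print(f"{ref}: +{rise:.1f} deg C -> negligible self-heating, considered de-rated")
```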
Another technique is to use the FMEDA (Failure Modes, Effects, and Diagnostic Analysis) process to determine where to focus the de-rating analysis. From a safety point of view, we are mainly concerned with dangerous undetected failures. These are failures that cause a loss of the safety function and are not detected by diagnostics; the problem will therefore not be known until the safety function is needed and fails to execute. The FMEDA technique identifies all component failure modes that result in dangerous undetected failures. This includes failures that are not detected by diagnostics at all as well as failures that are not always detected by diagnostics (Diagnostic Coverage < 100%). Many component failures are safe (they cause the safety function to falsely trip) or have no effect on the safety function. By focusing the de-rating analysis only on components that contribute to the dangerous undetected failure rate, as sketched below, the analysis is limited to the items that can adversely affect the safety of the product. As a result, the time spent on the analysis is optimized to do the most good from a safety point of view.
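As a rough illustration of that scoping step, the sketch below computes each failure mode's dangerous undetected rate as lambda_DU = lambda_D x (1 - DC) and builds a worklist of components with a nonzero contribution. The failure rates (in FITs), failure modes, and coverage figures are illustrative assumptions, not data from a real FMEDA.

```python
# FMEDA-driven scoping: only components contributing to the dangerous
# undetected failure rate (lambda_DU) make the de-rating worklist.
# All rows below are hypothetical.

fmeda_rows = [
    # (reference, failure mode, lambda in FIT, dangerous?, diagnostic coverage)
    ("U1",  "output stuck high", 12.0, True,  0.90),  # detected 90% of the time
    ("U1",  "output stuck low",   8.0, False, 0.00),  # safe: falsely trips the function
    ("R12", "open",               2.0, True,  0.00),  # dangerous, no diagnostic
    ("C3",  "short",              4.0, False, 0.00),  # safe failure mode
]

worklist = {}
for ref, mode, fit, dangerous, dc in fmeda_rows:
    lambda_du = fit * (1.0 - dc) if dangerous else 0.0  # lambda_DU = lambda_D * (1 - DC)
    if lambda_du > 0.0:
        worklist[ref] = worklist.get(ref, 0.0) + lambda_du

for ref, du in sorted(worklist.items()):
    print(f"{ref}: lambda_DU contribution {du:.2f} FIT -> include in de-rating analysis")
```

In this example only U1 and R12 make the worklist; C3's safe failure mode keeps it out of the focused de-rating analysis.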
Overall, the de-rating process is a well-proven technique to improve the reliability and safety of a product. For safety systems, focusing the analysis on the areas that affect the safety of the product is a good way to optimize your use of this technique.