To: lomapaseo
So what real failure rate was the unexpected Achilles heel in this accident?
First of all, I was not involved in the Alaska Air incident or with McDonnell Douglas, so I can only comment on the system that is used to predict the reliability and ultimate safety of an aircraft or any other system. Explaining reliability and safety is extremely difficult unless the person addressed is familiar with the development of a Fault Tree Analysis (FTA), Reliability Block Diagrams (RBD) and Failure Modes, Effects and Criticality Analyses (FMECA).
The mythical failure rate of 10^-9 can be addressed two ways. The FARs require that a single point failure that can contribute to the loss of an aircraft occur no more frequently than 10^-9, and if at all possible it should be designed out. The 10^-9 figure most people quote does not apply at the aircraft level; instead, it applies to a system failure that can cause loss of the aircraft. The FARs and the JARs specify the acceptable frequency of a system failure for anything from a minor problem to loss of the aircraft, and the upper limit is usually 10^-9.
Here is how the system relies on the manipulation of numbers. If the regulations specify that a flap or slat system can lock up no more frequently than 10^-6, the reliability engineer is forced to use unrealistic failure rates for the individual components that can cause lock-up, as there may be several hundred components in the respective systems whose failure can result in lock-up. Where do these failure rates come from? Mostly from government-developed databases that contain several hundred items that may or may not be used on aircraft. The reliability engineer must select a usable failure rate from an item that is used in, say, a submarine and multiply it by an environmental K factor to obtain the failure rate he needs for his aircraft. In some cases the failure rate will have an upper, median and lower level of confidence. He is free to pick whichever confidence level best fits his calculation, and he ultimately arrives at the desired failure rate of 10^-6. If the requirement for runaway, or for non-movement when commanded, is 10^-9, then the search for usable failure numbers becomes even more ridiculous.
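To make the number-picking concrete, here is a rough Python sketch of that selection process. Every value in it is invented for illustration (the base rates, the K factor and the component count are my assumptions, not entries from any real database), and it assumes all components share the same rate and that any one of them failing locks up the system.

```python
# Hypothetical illustration only: none of these numbers come from a real database.
BASE_RATES = {"lower": 1e-9, "median": 1e-8, "upper": 5e-8}  # made-up generic failure rates
K_FACTOR = 4.0        # made-up environmental multiplier for the aircraft application
N_COMPONENTS = 200    # stand-in for "several hundred" components that can cause lock-up
REQUIREMENT = 1e-6    # required system-level lock-up rate


def system_rate(confidence: str) -> float:
    """Sum of identical K-adjusted component rates (any single failure causes lock-up)."""
    return N_COMPONENTS * BASE_RATES[confidence] * K_FACTOR


for level in ("lower", "median", "upper"):
    rate = system_rate(level)
    verdict = "meets" if rate <= REQUIREMENT else "misses"
    print(f"{level:>6} confidence: {rate:.1e} {verdict} the requirement")
```

With these invented numbers only the most optimistic confidence level gets under the requirement, which is the kind of choice described above.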
So now, after making the reliability calculation using unrealistic numbers, the Reliability Engineer passes them to the Systems Safety Engineer. The Systems Safety Engineer then creates an FTA, which is made up of gates, the most common of which are AND gates and OR gates. The diagram is built from the top down, meaning that the top gate is the actual failure that breaches the 10^-9 requirement. The top gate is connected to the lower gates by connection lines, or failure paths. The failure paths leading to the top gate come from either AND gates or OR gates, and each of those gates represents a failure that can propagate upward and lead to the breach of the 10^-9 requirement. There can be as many gates of both kinds as are needed to reflect the system complexity. There may be as many FTAs as required to reflect all of the services that supply the system, such as hydraulics, electrical and electronics, and the hardware elements of the system.
Imagine the gates as being locks. On an OR gate there can be several failures, each of which has a key to the lock, and any one of these failures can pass through the gate. On an AND gate each of the failures has a key to the lock, but all must be present in order for the collective failure to pass through the gate. This is a simplification but easy to understand. Now for the mathematics (Boolean algebra). Let’s assume that an OR gate has five failures, each of which can open the lock. The math is (1 x 10^-6) + (1 x 10^-6) + (1 x 10^-6) + (1 x 10^-6) + (1 x 10^-6), with a result of 5 x 10^-6. Using the same numbers, let’s look at an AND gate. The math is (1 x 10^-6) x (1 x 10^-6) x (1 x 10^-6) x (1 x 10^-6) x (1 x 10^-6), with a result of 1 x 10^-30. The numbers are unrealistic, first because they stem from numbers that have no relevance to the calculations, and secondly because they are unrealistic when you compare them to actual recorded failures.
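The gate arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are mine, and it assumes the five failures are independent. The simple sum used for the OR gate is the usual small-probability approximation; the exact form for independent inputs is included for comparison.

```python
import math


def or_gate(probs):
    """OR gate, small-probability approximation: add the input probabilities."""
    return sum(probs)


def or_gate_exact(probs):
    """OR gate for independent inputs: 1 minus the chance that none of them occurs."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def and_gate(probs):
    """AND gate for independent inputs: multiply the input probabilities."""
    return math.prod(probs)


inputs = [1e-6] * 5  # five failures at 1 x 10^-6 each, as in the example above

print(f"OR  (sum)   : {or_gate(inputs):.3e}")
print(f"OR  (exact) : {or_gate_exact(inputs):.3e}")
print(f"AND         : {and_gate(inputs):.3e}")
```

At these magnitudes the summed and exact OR results agree to several significant figures, so the shortcut costs nothing; the AND result is where the tiny numbers come from.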
Here is the final kicker. The FTAs are for the systems and not the aircraft. Each FTA terminates in assessing the probability of failure of the specific system. This should be carried one step further by making an FTA with an OR gate representing the aircraft, with each of the systems feeding into that gate. Because it is an OR gate, you would most likely come up with a catastrophic loss figure of something in the area of 1 x 10^-8 or possibly lower, which truly reflects the crash rate of commercial aircraft. However, the FAA does not require this assessment at the aircraft level. So much for safety.
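As a rough illustration of that aircraft-level OR gate, the sketch below feeds ten system-level top events into one gate. The system names and the assumption that each one sits exactly at 1 x 10^-9 are hypothetical, chosen only to show how the system figures add up.

```python
# Hypothetical system list: names and values are invented for illustration.
SYSTEM_TOP_EVENTS = {
    "flight_controls": 1e-9,
    "flaps_slats": 1e-9,
    "hydraulics": 1e-9,
    "electrical": 1e-9,
    "fuel": 1e-9,
    "engines": 1e-9,
    "landing_gear": 1e-9,
    "avionics": 1e-9,
    "pressurization": 1e-9,
    "fire_protection": 1e-9,
}


def aircraft_level(top_events):
    """Aircraft-level OR gate: add up the system-level catastrophic probabilities."""
    return sum(top_events.values())


total = aircraft_level(SYSTEM_TOP_EVENTS)
print(f"{len(SYSTEM_TOP_EVENTS)} systems -> aircraft-level figure of about {total:.1e}")
```

Ten systems that each just meet 1 x 10^-9 already put the aircraft-level figure at about 1 x 10^-8, an order of magnitude worse than the number usually quoted.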
One final note. In order to gain certification, the FTA and the attendant report are required, among other engineering reports, but the FAA never sees the RBD or the FMECA, and they are not at all concerned with how the failure rates were derived. If they ever express interest in seeing these documents, they must request permission from the manufacturer.