PPRuNe Forums - View Single Post - Dreamliner in emergency landing at Dublin Airport

26th Oct 2015, 13:56

#42

Ian W

Join Date: Dec 2006

Location: Florida and wherever my laptop is

Posts: 1,350

Quote:

Originally Posted by peekay4

Those pilots were effectively performing verification, not validation. They were testing whether or not their aircraft performed to specs, not whether the specs were correct.

NASA did many studies over the decades and surprisingly (?) found that it is actually impossible to find all safety-critical software bugs by testing!

That's because as complexity increases, the time required to test all possible conditions rises exponentially. Completely and exhaustively testing an entire suite of avionics software could literally take thousands of years.

Therefore, instead of full exhaustive testing, we selectively test what we determine to be the most important conditions to test. Metrics are gathered and analysis is performed to provide the required test coverage, check boundary conditions, ensure that there are no regressions, etc.

However, one can't prove that a piece of software is "bug free" this way, because not all possible conditions are tested.

Today as an alternative, the most critical pieces of software are verified using formal methods (i.e., using mathematical proofs) to augment -- or entirely replace -- functional testing. Unlike testing, formal methods can prove design/implementation correctness to specifications. Unfortunately, formal methods verification is a very costly process and thus is not used for the vast majority (>99.9%) of code.

The rest of the code relies on fault-tolerance. Instead of attempting to write "zero bug" software, safety is "assured" by having multiple independent modules voting for an outcome, and/or having many defensive layers so failure of one piece of code doesn't compromise the safety of the entire system (Swiss-cheese model applied to software).

This "fault-tolerance" approach isn't perfect but provides an "acceptable" level of risk.
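To put a rough number on the "thousands of years" point in the quote, here is a back-of-envelope sketch. The input widths and the rate of one million test cases per second are made-up figures for illustration only, not taken from any real avionics test rig, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_test_years(input_bits: int, tests_per_second: float) -> float:
    """Years needed to try every combination of `input_bits` bits of input.

    Assumes (hypothetically) that every input combination is one test case
    and that the test rate is constant.
    """
    combinations = 2 ** input_bits  # doubles with every extra bit of input
    return combinations / tests_per_second / SECONDS_PER_YEAR


# Hypothetical input widths, tested at an assumed 1,000,000 cases per second.
for bits in (32, 48, 64):
    years = exhaustive_test_years(bits, 1_000_000)
    print(f"{bits} input bits: about {years:.3g} years")
```

Each extra bit of input doubles the time, so a function of just two 32-bit values is already out of reach at that rate, before any internal state or timing is considered.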

Exhaustive testing is when either the tester or the funds are exhausted; it has no bearing on the number of bugs yet to be found.

Mathematical proof of software is an example of the 'streetlight effect': more and more effort is expended looking for bugs where they are simple to find but very unlikely, in the code that can be mathematically checked, rather than where they most often are, which is in the system design. However, it makes some companies a lot of money, and it delays and even prevents the implementation of modern hardware and software.

Fault tolerance by triplex voting is fine until there is a three-way disagreement, and/or the voting software makes a mistake and shuts down the process whose software is correct, then follows the output of the two other processes, whose software is incorrect. This happens surprisingly often.
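A minimal sketch of the two failure modes, assuming a simple majority voter over three integer channel outputs. The function name and the values are hypothetical, and a real voter would compare within tolerance bands rather than for exact equality.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def triplex_vote(outputs: Sequence[int]) -> Tuple[Optional[int], List[int]]:
    """Majority vote over three channel outputs (hypothetical voter).

    Returns (agreed value, indices of the channels voted out).
    The agreed value is None when no two channels agree.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        return None, []  # three-way disagreement: nothing to follow
    rejected = [i for i, out in enumerate(outputs) if out != value]
    return value, rejected


# Assume, for illustration, that the correct output is 10.
print(triplex_vote([10, 10, 99]))  # one faulty channel is outvoted
print(triplex_vote([10, 20, 30]))  # three-way disagreement, no majority
print(triplex_vote([99, 99, 10]))  # two channels share the same wrong answer
```

In the last case the voter does exactly what it was designed to do and still drops the only correct channel. Majority voting only protects against faults that are independent, which is not the case when two channels carry the same software error.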