Dreamliner in emergency landing at Dublin Airport

Rumours & News Reporting Points that may affect our jobs or lives as professional pilots. Also, items that may be of interest to professional pilots.

Old 26th Oct 2015, 14:20
  #41 (permalink)  
 
Join Date: Mar 2014
Location: wales
Age: 81
Posts: 316
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by beardy
IF it's a software problem AND it's the same software on both engines I would have expected that to impinge on ETOPS certification, since the risk of the second engine doing the same thing is higher than if it were a mechanical (as opposed to design) fault. ETOPS is defined by acceptable risk of the other engine failing within a set time period, and whilst demonstrated failure rate is a very good metric it should not, IMHO, be considered in isolation.

When put that way it is so logical; I do hope the certifying bodies used the same logic. It's a long time since I did a long flight over water, and I always preferred 4 engines to 2. Having read your post, that preference will remain in place.
oldoberon is offline  
Old 26th Oct 2015, 14:56
  #42 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by peekay4
Those pilots were effectively performing verification, not validation. They were testing whether or not their aircraft performed to specs, not whether the specs were correct.

NASA did many studies over the decades and surprisingly (?) found that it is actually impossible to find all safety-critical software bugs by testing!

That's because as complexity increases, the time required to test all possible conditions rises exponentially. Completely and exhaustively testing an entire suite of avionics software could literally take thousands of years.

Therefore, instead of full exhaustive testing, we selectively test what we determine to be the most important conditions to test. Metrics are gathered and analysis is performed to provide the required test coverage, check boundary conditions, ensure that there are no regressions, etc.

However, one can't prove that a piece of software is "bug free" this way, because not all possible conditions are tested.
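
As a rough illustration of the exponential growth described above, the sketch below counts the input combinations for a hypothetical function with n independent boolean inputs and estimates how long it would take to run them all. The rate of one million test cases per second is an assumption picked for illustration, not a figure from any real avionics test rig.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_test_years(n_boolean_inputs: int, tests_per_second: float = 1e6) -> float:
    """Years needed to try every combination of n independent boolean inputs."""
    combinations = 2 ** n_boolean_inputs  # each extra input doubles the test space
    return combinations / tests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical input counts; real systems also have continuous inputs and state.
    for n in (20, 40, 64):
        print(f"{n} inputs: {exhaustive_test_years(n):.3g} years")
```

Adding a single input doubles the time, which is why coverage is chosen selectively rather than exhaustively.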

Today as an alternative, the most critical pieces of software are verified using formal methods (i.e., using mathematical proofs) to augment -- or entirely replace -- functional testing. Unlike testing, formal methods can prove design/implementation correctness to specifications. Unfortunately, formal methods verification is a very costly process and thus is not used for the vast majority (>99.9%) of code.

The rest of the code relies on fault-tolerance. Instead of attempting to write "zero bug" software, safety is "assured" by having multiple independent modules voting for an outcome, and/or having many defensive layers so failure of one piece of code doesn't compromise the safety of the entire system (swiss-cheese model applied to software).

This "fault-tolerance" approach isn't perfect but provides an "acceptable" level of risk.
Exhaustive testing is when either the tester or the funds are exhausted; it has no bearing on the number of bugs yet to be found.

Mathematical proof of software is an example of the 'streetlight effect': more and more effort is expended looking for bugs where they are simple to find but very unlikely (in the code that can be mathematically checked) rather than where they most often are, which is in system design. However, it makes some companies a lot of money, and it delays and even prevents implementation of modern hardware and software.

Fault tolerance by triplex voting is fine until there is a three-way disagreement, and/or the voting software makes a mistake: it shuts down the process whose software is correct and follows the output of the two other processes whose software is incorrect. This happens surprisingly often.
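
The two failure cases above can be shown with a minimal majority voter. This is only a sketch: the function name `majority_vote` and the channel values are made up for illustration, and a real voter would deal with tolerances, timing and channel health rather than exact equality.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of channels, or None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


if __name__ == "__main__":
    # Assume 100 is the correct answer in every case below.
    print(majority_vote([100, 100, 101]))  # one faulty channel is outvoted
    print(majority_vote([100, 101, 102]))  # three-way disagreement: no majority
    print(majority_vote([101, 101, 100]))  # two channels share a bug and win the vote
```

In the last case the voter behaves exactly as designed and still selects the wrong answer, because the two incorrect channels agree with each other.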
Ian W is offline  
Old 26th Oct 2015, 15:01
  #43 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by oldoberon
When put that way it is so logical; I do hope the certifying bodies used the same logic. It's a long time since I did a long flight over water, and I always preferred 4 engines to 2. Having read your post, that preference will remain in place.
Unfortunately, if all four have the same software version then all four could in theory crash, and such faults do happen even on fully tested systems, as in the story "F-22 Squadron Shot Down by the International Date Line".
Ian W is offline  
Old 26th Oct 2015, 16:38
  #44 (permalink)  
 
Join Date: Mar 2014
Location: wales
Age: 81
Posts: 316
Likes: 0
Received 0 Likes on 0 Posts
Yes, most of them have been around longer, so in theory the SW is more proven (I hope).

Your link: wow, close one!!
oldoberon is offline  
Old 26th Oct 2015, 17:10
  #45 (permalink)  
 
Join Date: Sep 2014
Location: Canada
Posts: 1,257
Likes: 0
Received 0 Likes on 0 Posts
There is an overarching software design & architecture requirement that any "catastrophic failure" -- a failure resulting in the loss of the airplane and deaths of its occupants -- must be "extremely improbable".

For FAR 25 aircraft, "extremely improbable" is defined as a failure rate of no more than 1 per billion flight hours (1E-9), established by a quantitative safety assessment.
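
To get a feel for what a 1E-9 per flight hour target means over a whole fleet, here is a small sketch. It assumes a constant failure rate (a simple Poisson model, which is a simplification), and the exposure figures are hypothetical values chosen for illustration only.

```python
import math


def prob_at_least_one(rate_per_hour: float, flight_hours: float) -> float:
    """P(at least one event) for a constant failure rate: 1 - exp(-rate * hours)."""
    return -math.expm1(-rate_per_hour * flight_hours)


if __name__ == "__main__":
    # Hypothetical cumulative fleet exposures, in flight hours.
    for hours in (1e6, 1e8, 1e9):
        p = prob_at_least_one(1e-9, hours)
        print(f"{hours:.0e} flight hours at 1E-9/h: P(at least one) = {p:.4f}")
```

`math.expm1` is used instead of `1 - math.exp(...)` because it keeps precision when the product of rate and hours is very small.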

However, as we found out with the Challenger shuttle disaster, this kind of quantitative assessment can be a bit pie in the sky. Still, critical software does tend to be extremely reliable. Just remember to reboot from time to time...
peekay4 is offline  
Old 26th Oct 2015, 17:20
  #46 (permalink)  
 
Join Date: Mar 2002
Location: Florida
Posts: 4,569
Likes: 0
Received 1 Like on 1 Post
For FAR 25 aircraft, "extremely improbable" is defined as a failure rate of no more than 1 per billion flight hours (1E-9), established by a quantitative safety assessment.
In general, to put this into perspective, catastrophic failures (Part 25) from all causes occur at rates 100 times higher (1E-7).

I'm a lot less worried about the system causing the crash than I am about the pilot's contribution.
lomapaseo is offline  
Old 26th Oct 2015, 17:32
  #47 (permalink)  
 
Join Date: Feb 2004
Location: Dublin, Ireland
Posts: 486
Received 5 Likes on 4 Posts
There is little logic to the way things happen in DUB. Whilst the rest of the world is pretty much standard ICAO, Dublin carries on business in its own little bubble.
Most of the issues you raise about taxiway nomenclature and layout and also where the Ethiopian 787 was directed to park can hardly be laid at the door of ATC. Those are all matters for the airport authority. Amongst the recommendations of a recent AAIU report into a ground collision between two 737s at Dublin was that:

"The Dublin Airport Authority (DAA) conduct a critical review of the taxiway system at Dublin Airport, to ensure that taxiway routes are as simple as possible in order to avoid pilot confusion and the need for complicated instructions."

The report also states that Dublin Airport accepts the recommendation and will undertake a critical review of the taxiway system to ensure that taxiway routes are as simple as possible.
Liffy 1M is offline  
Old 26th Oct 2015, 19:37
  #48 (permalink)  
 
Join Date: Sep 2015
Location: Nice
Posts: 19
Likes: 0
Received 0 Likes on 0 Posts
Exhaustive testing is when either the tester or the funds are exhausted; it has no bearing on the number of bugs yet to be found.

Nothing new. The problem of bugs in systems has always been with us in design, it is just that computers probably give more opportunities for error.

Ask the captain of the BA flight that lost two donkeys on short finals into LHR. That bug in the fuel system had managed to hide itself for several years, before raising its ugly head.


tatelyle is offline  
Old 27th Oct 2015, 00:31
  #49 (permalink)  
 
Join Date: Jan 2015
Location: Delete me
Age: 58
Posts: 28
Likes: 0
Received 0 Likes on 0 Posts
I've been a software developer for 26 years and 100% bug free software can simply mean you and whoever tested the software both misinterpreted the spec in the same way, OR the analyst misinterpreted the requirements and you coded their mistake perfectly and the tester agreed. This is (literally, not kidding) why I genuinely fear passengering on an Airbus. Nothing can ever replace you guys and we shouldn't be trying.
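
A toy version of that situation, as a sketch only: the requirement, the 1000 ft threshold and all function names below are hypothetical. The analyst, coder and tester all share the same misreading, so every test passes while the behaviour is the opposite of what was intended.

```python
# Hypothetical intent: "raise the alert when the aircraft is BELOW 1000 ft".
# Shared misreading: "raise the alert when the aircraft is AT OR ABOVE 1000 ft".


def low_altitude_alert(altitude_ft: float) -> bool:
    """Implements the misread spec faithfully."""
    return altitude_ft >= 1000


def tester_suite_passes() -> bool:
    """Test cases derived from the same misread spec (verification only)."""
    return low_altitude_alert(1500) is True and low_altitude_alert(500) is False


def intended_alert(altitude_ft: float) -> bool:
    """What the requirement was actually meant to say (validation reference)."""
    return altitude_ft < 1000


if __name__ == "__main__":
    print("tests pass:", tester_suite_passes())
    print("matches intent at 500 ft:", low_altitude_alert(500) == intended_alert(500))
```

The code is "100% bug free" against its tests; only a check against the real intent shows the problem.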
Infieldg is offline  
Old 27th Oct 2015, 16:36
  #50 (permalink)  
AR1
 
Join Date: May 2007
Location: Nottinghamshire
Age: 63
Posts: 710
Received 4 Likes on 1 Post
You really need to get out more.

Before SW failure was mechanical failure. And that's not gone away either. Despite the fact that software can never be 100% bug free (not my assertion), you fly in an era of unprecedented safety in air travel.

Unfortunately those same technical advances also give us the ability to spout tripe in an unprecedented way. And that scares me.
AR1 is offline  
Old 27th Oct 2015, 17:41
  #51 (permalink)  
 
Join Date: Jun 2009
Location: Canada
Posts: 464
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by AR1
Before SW failure was mechanical failure. And that's not gone away either.
But you can usually detect mechanical problems early: for example, wear or cracks in metal parts. Software may work perfectly for ten years, then finally hit the rare bug that causes it to fail for no apparent reason. Worse, every instance of that software may fail at the same time all over the world (e.g. the various leap second bugs). Gears don't even know about leap seconds.
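
A small sketch of that kind of simultaneous failure, using Python's standard `datetime` type, which only accepts seconds in the range 0..59. The helper names are made up for illustration; the timestamp is the leap second inserted at the end of June 2015.

```python
from datetime import datetime


def parse_utc(year: int, month: int, day: int, hour: int, minute: int, second: int) -> datetime:
    """Naive conversion that assumes every minute has exactly 60 seconds."""
    return datetime(year, month, day, hour, minute, second)


def every_instance_fails(n_instances: int) -> bool:
    """Feed the same leap-second timestamp to n identical copies of the code."""
    failures = 0
    for _ in range(n_instances):
        try:
            parse_utc(2015, 6, 30, 23, 59, 60)  # 23:59:60 UTC, a valid leap second
        except ValueError:
            failures += 1  # identical code, identical input, identical failure
    return failures == n_instances


if __name__ == "__main__":
    print("all copies failed together:", every_instance_fails(4))
```

Adding more copies of the same code gives no protection here, since the fault is in the shared design rather than in any one unit.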

The other issue is that a third party can examine all the mechanical parts for cracks, and tell you there's a problem. A third party usually can't examine the software that runs those parts, because it's closed source. They can only test it as a black box.

There's a truly scary document online from one of the software guys who was given access to Toyota's software as an expert witness for the 'unintended acceleration' trials. Some of the things in there are quite mind-boggling, but no-one knew about them because they had no access to the software.

Software has definitely made many things far more reliable. But it's also replaced many predictable failures with unpredictable ones.
MG23 is offline  
Old 27th Oct 2015, 19:07
  #52 (permalink)  
 
Join Date: Nov 2000
Location: UK
Age: 69
Posts: 1,395
Received 40 Likes on 22 Posts
Who judges the acceptable risk of software bugs?
beardy is offline  
Old 27th Oct 2015, 19:24
  #53 (permalink)  
 
Join Date: Aug 2013
Location: Washington.
Age: 73
Posts: 1,051
Received 143 Likes on 48 Posts
Software Failure?

I doubt there is any such thing. Software just is whatever has been coded. Unless the memory media on which it is stored fails somehow, the software remains intact, just as coded and compiled.

Software error? Yes, and as so many have explained, testing to find every software error (AKA bug) is fairly impractical in the large bodies of complex code used in avionics, engines and such. So rather than completely exhaustive testing, though some testing indeed is done, there is required to be a disciplined software development process, the rigor of which is driven by the safety effects that a function affected by software error might be considered to have.

Highly critical functions with potentially catastrophic effects from software errors must have a "design assurance level" of A, which of course is the highest and most expensive development process.
GlobalNav is offline  
Old 27th Oct 2015, 21:26
  #54 (permalink)  
 
Join Date: Jan 2007
Location: San Jose
Posts: 727
Likes: 0
Received 0 Likes on 0 Posts
But you can usually detect mechanical problems early: for example, wear or cracks in metal parts. Software may work perfectly for ten years, then finally hit the rare bug that causes it to fail for no apparent reason.
It can happen with mechanical stuff too: Air Midwest 5481 back in 2003. Someone introduced a mechanical 'bug' in that they rigged the elevator cables incorrectly. It flew OK for several flights until circumstances conspired to trip the bug, in the form of a CofG too far aft, and it pitched up and stalled. OK, not quite ten years, but no one detected the error, and had the error not been made, it would have been recoverable - the limited elevator travel due to the error meant it couldn't cope.
llondel is offline  
Old 28th Oct 2015, 03:46
  #55 (permalink)  
 
Join Date: Jun 2011
Location: france
Posts: 763
Likes: 0
Received 0 Likes on 0 Posts
Snoop

Originally Posted by beardy
Who judges the acceptable risk of software bugs?
Refer to the Ariane 501 report (4 June 1996 crash) by Jacques-Louis Lions for the best practices.
roulishollandais is offline  
Old 28th Oct 2015, 05:10
  #56 (permalink)  
 
Join Date: Sep 2008
Location: Where it is comfortable...
Age: 60
Posts: 909
Received 13 Likes on 2 Posts
Software just is whatever has been coded.
That is very simplistic and incorrect. Software comprises the original set of specifications on what the system is supposed to achieve, the algorithm (which is a translation of the specs into the particulars of the coding language used), the actual code, the set of static and dynamic data which are used by the code, and the user instructions/manual on how to operate the software.

"Bugs" can be introduced everywhere along this process, and coding bugs (where there is an actual syntax or logical error in the code) are usually the smallest percentage of them, and the easiest to catch. The most difficult part is the specifications, where a professional in a particular subject needs to describe his/her knowledge to someone who is at best marginally versed in the profession but is able to develop efficient algorithms to achieve what the specifications say. There are many things which may get lost in translation here, and the most dangerous are those which were 'forgotten' from the specifications simply because a particular scenario was not considered. These scenarios are usually in the realm of valid data, as basic software design principles mandate that invalid data ranges must be considered and treated (e.g. if a parameter must be positive, in a critical system there MUST be a code path which handles the case where that parameter is negative).


A further layer of "bugs" are, as Microsoft once famously said, not bugs but features. Errors can be introduced in the user manual, which may not correctly describe how the system works, especially in remote and unlikely scenarios. This causes the software to behave as specified, but differently from what users expect. More issues are introduced through the user interface, when users try things which are explicitly disallowed in the manual, with totally unpredictable outcomes as those scenarios were neither considered nor tested.


From the user perspective all of the above are "bugs", but only a very small portion are actually attributable to the code itself.
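
The point above about invalid data ranges (a parameter that must be positive still needs a path for the negative case) can be sketched as follows. The name `checked_positive` and the idea of a fixed fallback value are assumptions for illustration; what counts as a safe fallback depends entirely on the system.

```python
import math


def checked_positive(value: float, fallback: float) -> float:
    """Return value if it is a finite positive number, else a known-safe fallback."""
    if isinstance(value, (int, float)) and math.isfinite(value) and value > 0:
        return value
    # Explicit path for negative, zero, NaN, infinite or non-numeric input.
    return fallback


if __name__ == "__main__":
    print(checked_positive(5.0, 1.0))
    print(checked_positive(-3.0, 1.0))
    print(checked_positive(float("nan"), 1.0))
```

This only covers the easy half of the problem: it catches invalid data, not the valid-looking scenario that nobody thought to put in the specification.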
andrasz is offline  
Old 28th Oct 2015, 13:38
  #57 (permalink)  
 
Join Date: Jan 2010
Location: Edinburgh
Age: 85
Posts: 72
Likes: 0
Received 5 Likes on 3 Posts
Whenever I wrote in a manual "Whatever you do, don't press button 'A'", I had to go back to the product design and delete or protect button 'A'. Eventually, I got round to writing the manual before I started the design. That only took half a lifetime to figure out, but then I'm not the sharpest knife in the box!
DType is offline  
Old 28th Oct 2015, 16:08
  #58 (permalink)  
 
Join Date: Jan 2006
Location: Ijatta
Posts: 435
Likes: 0
Received 0 Likes on 0 Posts
I never could keep my fingers off the bloody buttons. Especially on long haul flights.

Used to drive my F/O's nuts!

Last edited by wanabee777; 28th Oct 2015 at 20:20.
wanabee777 is offline  
Old 28th Oct 2015, 18:03
  #59 (permalink)  
 
Join Date: May 2008
Location: Paris
Age: 59
Posts: 101
Likes: 0
Received 0 Likes on 0 Posts
@peekay4:

Yea, although that might be indicative of something more than just a requirements error -- pointing to a larger process breakdown.

Typically there are high level requirements, specific system / software requirements, low-level requirements, etc., which all need to be traceable up and down between them, and also have full traceability to the code, to the binary, and to all the test cases (and/or formal methods verifications as applicable).

For all data elements, there should be specifications to check for valid ranges for values (data domain), missing values (null checks), etc. Functions also need to have preconditions & postconditions on what parameter values are acceptable as part of the interface contract, and assertions which must hold true.
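
A minimal sketch of preconditions and a postcondition on one function. The function `interpolate` and its tolerance are hypothetical, chosen only to show the shape of an interface contract; they are not taken from any real avionics code.

```python
def interpolate(x: float, x0: float, x1: float, y0: float, y1: float) -> float:
    """Linear interpolation between (x0, y0) and (x1, y1) with an explicit contract."""
    # Preconditions: part of the interface contract, checked on every call.
    if not x0 < x1:
        raise ValueError("x0 must be strictly less than x1")
    if not x0 <= x <= x1:
        raise ValueError("x must lie within [x0, x1]")

    y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    # Postcondition: the result must lie between the end points,
    # allowing a small tolerance for floating-point rounding.
    tol = 1e-9 * max(1.0, abs(y0), abs(y1))
    assert min(y0, y1) - tol <= y <= max(y0, y1) + tol, "postcondition violated"
    return y


if __name__ == "__main__":
    print(interpolate(5, 0, 10, 0, 100))
    try:
        interpolate(11, 0, 10, 0, 100)
    except ValueError as err:
        print("rejected:", err)
```

Preconditions are raised as errors because the caller can violate them; the postcondition is an assertion because a violation would mean the function itself is wrong.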

There should've also been models of both the specifications and the design and processes to check these models for completeness.

And even if there are data errors, as mentioned before the software should be designed to be fault-tolerant and fail safe instead of simply freezing up at 400' AGL.

What you don't want to do is to fix this one specific requirement while there may be other missing/incomplete/incorrect requirements out there. So you have to take a look into the SDLC process and figure out why the requirement was missed to begin with.
You may have worked in the past with Orthogonal Defect Classification. This is where things get scary. In nailing down a coding error at one stage, we drilled through to the conclusion that the error was a "missing typo". At the meeting we collapsed in laughter. The problem essentially consisted of the fact that a typo hadn't been propagated right through the development cycle. When we recovered ourselves we realised how utterly catastrophic such an error might be.

With teams using US and UK English there were multiple risks of variant typos, each separately close enough to the other to pass muster, but with as yet untested fallback routines failing in the event.

Avionic software at least appears to fall back to the backstop of handing things over to the pilot(s). The day that they stop doing so is the day that I keep my feet on the ground.

Systems are never perfect, and they don't exist in a vacuum; parallel systems may make undesired demands of them.

I'm not flying when the person in the seat is a systems administrator; I want a pilot up there. One who can over-ride every damn system. Yes, they make mistakes, but at least they can react according to their skills, and at least their ass is on the line too.
Nialler is offline  
Old 29th Oct 2015, 06:21
  #60 (permalink)  
 
Join Date: Apr 2007
Location: moraira,spain-Norfolk, UK
Age: 82
Posts: 389
Likes: 0
Received 0 Likes on 0 Posts
Challenger disaster

Hello peekay4,
I think you will find the Challenger disaster was in the numbers.
The relevant engineers voted against flight. It flew because of the
common management idea that if it (they) flew several times then they were OK.
esa-aardvark is offline  

