
Dreamliner in emergency landing at Dublin Airport

Rumours & News Reporting Points that may affect our jobs or lives as professional pilots. Also, items that may be of interest to professional pilots.


Old 24th Oct 2015, 13:43
  #21 (permalink)  
 
Join Date: Mar 2015
Location: North by Northwest
Posts: 476
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by msbbarratt
It is possible to write bug free software, but proving that that's what's been achieved is basically impossible except in trivial examples.

Mostly we rely on a whole lot of very carefully designed testing and many hours of logged trouble-free running before reluctantly concluding that it might be ok... That's why making changes to this kind of software is so expensive - All the software tests have to be repeated.
No amount of testing ever identifies all bugs. Around 1980, Airbus shared their early static fly-by-wire flight test results with the DoD program I was supporting. If memory serves, they contracted with three different companies in three different countries to perform a full suite of testing of the fly-by-wire software, with the thought that the different companies would find different bugs - at least a few. None of the companies knew any of the others existed, the test protocols were unique to each company, and results were not shared. The hope was that each company might ferret out major flaws that the others might not catch. Much to their chagrin (or maybe just due to good software coding practices), well over 90% of the flaws were found by all three companies, and only a few less-than-critical bugs were identified uniquely. The results surprised a number of people and, as events would later prove, not all major issues were uncovered.
b1lanc is offline  
Old 24th Oct 2015, 16:23
  #22 (permalink)  
 
Join Date: Mar 2015
Location: France
Posts: 29
Likes: 0
Received 0 Likes on 0 Posts
Not every software failure is a bug.
You analyse a problem.
The code you write doesn't behave as expected in some cases - that's a bug.
However, if you have forgotten some cases in your analysis, I wouldn't call that a bug, because writing down the procedure on a sheet of paper would have produced the same error... therefore it has nothing to do, specifically, with software production.
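Alain67's distinction can be shown with a toy sketch (purely illustrative; the function and numbers are invented). The code below implements the written procedure exactly, so by his definition it contains no bug - yet it fails for a case the analysis never considered, and the paper procedure would fail the same way.

```python
def fuel_endurance_hours(fuel_kg, burn_rate_kg_per_hour):
    # Implements the written procedure exactly: endurance = fuel / burn rate.
    # The analysis never considered burn_rate == 0 (all engines shut down),
    # so this raises ZeroDivisionError there - an analysis gap, not a coding bug.
    return fuel_kg / burn_rate_kg_per_hour
```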
Alain67 is offline  
Old 24th Oct 2015, 19:28
  #23 (permalink)  
 
Join Date: Mar 2002
Location: Florida
Posts: 4,569
Likes: 0
Received 1 Like on 1 Post
In all likelihood software didn't cause the problem; however, it can work wonders to fix it.

There is nothing new about rollback, just the assumptions about what caused it this time. Typically one looks at the input signals to the FADEC.

No need to even argue who screwed up - just address it and move on.
lomapaseo is offline  
Old 24th Oct 2015, 19:43
  #24 (permalink)  
 
Join Date: Jan 2006
Location: cloud 9
Posts: 18
Likes: 0
Received 0 Likes on 0 Posts
Meanwhile, ET-ARH calls to DUB with a spare donk.
red.sky@night is offline  
Old 24th Oct 2015, 23:13
  #25 (permalink)  
Resident insomniac
 
Join Date: Aug 2005
Location: N54 58 34 W02 01 21
Age: 80
Posts: 1,873
Likes: 0
Received 1 Like on 1 Post
ET-ARH calls to DUB with a spare donk.
As internal cargo or on a pylon?
G-CPTN is offline  
Old 25th Oct 2015, 07:03
  #26 (permalink)  
 
Join Date: Jan 2008
Location: Reading, UK
Posts: 15,985
Received 307 Likes on 158 Posts
Originally Posted by G-CPTN
As internal cargo or on a pylon?
The 777 doesn't have an external engine-ferrying capability.

The GEnx fits (just) on the main cargo deck of the 777F: https://www.youtube.com/watch?v=X2jZp35BvjU
DaveReidUK is offline  
Old 25th Oct 2015, 13:24
  #27 (permalink)  
 
Join Date: Jun 2011
Location: france
Posts: 760
Likes: 0
Received 0 Likes on 0 Posts
Snoop

I used the word "bug" to cover any failure in the software chain, for brevity; I hope you will accept that generalisation within this part of the thread.

Testing software is never cheap; on the contrary, it is the most expensive part of building software (a ratio of 1:100 is not unusual). And we have to repeat that testing several times in critical designs connected to human life or strategic aims.

That amount of debugging work is not abnormal if we accept a rule well known to engineers and managers since the beginning of software use: software is not chosen for reasons of fashion or modernity but to make money! If you use your software only a small number of times, it is much cheaper to do the job by hand. You have to weigh the benefit against the long, expensive, hard work of creating and using software. A sort routine is used millions of times, far more than a FADEC's code. Our traditional aeronautics engineers had a very safe record - thank you to them.

The arrival of the personal computer gave the computer the image of a toy, along with the idea that everything becomes magic, like a game, and that everybody could rebuild the world in the secrecy of his office, with no need for acceptance by others, or at least for listening to their opinion of his Ptolemaic fantasy.

That Airbus experience with the DoD and the other companies is interesting, but no more than that: testing must be done by the team who built the software. It is nearly impossible to debug software if you were not its designer.

Testing is a long chain of logic, at the same level of logic used in the software's conception. So you may use statistics only if you used statistics inside the software (e.g. game theory to solve a great number of equations), and only while strictly respecting the rules of statistics (e.g. if you are not sure of the distribution law, don't use tests that assume a Gaussian).

Bug-free software does exist - e.g. bootstraps and the multiple layers above the bootstrap that build operating systems or networks like the web (though you may still find some bugs in some OSes...).
roulishollandais is offline  
Old 25th Oct 2015, 15:08
  #28 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
roulishollandais
Testing must be done by the team who built the software. It is nearly impossible to debug software if you were not its designer.
I would disagree with this, as most major 'software' problems are not faults in the code but faults in the design, due to poor or incorrect systems analysis. As a simple case, if the systems analyst thought that a wind of 360 at 10 kts is blowing to the North, not from the North, and designed the software with that in mind, the software could be tested repeatedly and the fault would not be found. This is called verification testing: all it does is confirm that the software does what it was designed to do, without errors. Design faults, though, are only found in validation testing, where the user confirms that all the functional requirements of the system perform correctly. The user tests the system as a black box - i.e. has, and needs, no knowledge of the design or the software. This kind of design fault through misunderstanding would only be found in validation testing. These design faults are also the most expensive to fix, as the system is usually close to complete when validation testing is carried out, and fixes will require a large amount of repeat (regression) testing to ensure the fix has not broken anything else.
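Ian W's wind example can be sketched in a few lines (hypothetical code; the function names are invented). Both versions pass verification against their own spec; only validation against reality exposes the sign error.

```python
import math

def headwind_component(wind_from_deg, wind_speed_kt, runway_heading_deg):
    # Correct convention: "360 at 10 kt" blows FROM the north.
    angle = math.radians(wind_from_deg - runway_heading_deg)
    return wind_speed_kt * math.cos(angle)

def headwind_component_misread(wind_deg, wind_speed_kt, runway_heading_deg):
    # The analyst's misreading: wind blowing TO the given direction.
    # Every result has its sign flipped, yet the code matches its
    # (wrong) spec perfectly, so verification testing passes.
    angle = math.radians(wind_deg - runway_heading_deg)
    return -wind_speed_kt * math.cos(angle)

# Landing runway 36 (heading 360), wind 360 at 10 kt: reality is a
# 10 kt headwind, but the misread version reports a 10 kt tailwind.
```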

Unfortunately, while it was seen as exceptionally important in the early days of computing, systems analysis and design has been somewhat trivialized in recent years into a semi-automated process and this has led to some very costly development failures.
Ian W is offline  
Old 25th Oct 2015, 17:52
  #29 (permalink)  
 
Join Date: Jun 2011
Location: france
Posts: 760
Likes: 0
Received 0 Likes on 0 Posts
Snoop

Ian,
Complying with the user's request needs effectively that validation from somebody who did not conceive the software which is not his profession. But that user must be implicated all along the steps of building the system since his first request to the analyst.That request cannot be just a mute folder. A real cooperation leads analyst and customer to a safe product. Sevilla's test pilots story facing in validation a wrong system , or Warner's death show the danger to ask the user to test-validate the system. Here we are walking on eggs, little steps must be validated progressively.

I agree totally with your last sentence. That fact is really worrying. We are seeing there a negative evolution called "progress" similar to piloting technical losing basics and grounds.
rh

Last edited by roulishollandais; 25th Oct 2015 at 17:54. Reason: seeing
roulishollandais is offline  
Old 25th Oct 2015, 20:59
  #30 (permalink)  
 
Join Date: Sep 2014
Location: Canada
Posts: 1,257
Likes: 0
Received 0 Likes on 0 Posts
Those pilots were effectively performing verification, not validation. They were testing whether or not their aircraft performed to specs, not whether the specs were correct.

NASA did many studies over the decades and surprisingly (?) found that it is actually impossible to find all safety-critical software bugs by testing!

That's because as complexity increases, the time required to test all possible conditions rises exponentially. Completely and exhaustively testing an entire suite of avionics software could literally take thousands of years.

Therefore, instead of full exhaustive testing, we selectively test what we determine to be the most important conditions to test. Metrics are gathered and analysis is performed to provide the required test coverage, check boundary conditions, ensure that there are no regressions, etc.
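As a minimal sketch of the boundary-condition part of that (the limiter and its 20-104% band are invented for the example): instead of testing every possible input, test values cluster at and just either side of each boundary of the specification.

```python
def n1_command_limit(n1_percent):
    # Clamp a commanded N1 to an (invented) permitted 20..104 percent band.
    return max(20.0, min(104.0, n1_percent))

# Boundary-value tests: just below, at, and just above each limit,
# plus one nominal mid-range value - a handful of cases chosen for
# coverage, rather than the full input space.
boundary_cases = [19.9, 20.0, 20.1, 103.9, 104.0, 104.1, 60.0]
results = [n1_command_limit(x) for x in boundary_cases]
```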

However, one can't prove that a piece of software is "bug free" this way, because not all possible conditions are tested.

Today as an alternative, the most critical pieces of software are verified using formal methods (i.e., using mathematical proofs) to augment -- or entirely replace -- functional testing. Unlike testing, formal methods can prove design/implementation correctness to specifications. Unfortunately, formal methods verification is a very costly process and thus is not used for the vast majority (>99.9%) of code.

The rest of the code relies on fault-tolerance. Instead of attempting to write "zero bug" software, safety is "assured" by having multiple independent modules vote on an outcome, and/or by having many defensive layers so that failure of one piece of code doesn't compromise the safety of the entire system (the Swiss-cheese model applied to software).
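The voting idea can be sketched as follows (a bare-bones majority voter; real implementations also handle tolerances on analogue values, stale data, and so on):

```python
from collections import Counter

def vote(a, b, c):
    # Majority vote across three redundant channels: a single faulty
    # channel is outvoted. If all three disagree, declare a fault
    # rather than guess.
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count >= 2:
        return value
    raise RuntimeError("channel disagreement: no majority")
```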

This "fault-tolerance" approach isn't perfect but provides an "acceptable" level of risk.
peekay4 is offline  
Old 25th Oct 2015, 21:38
  #31 (permalink)  
 
Join Date: Jun 2011
Location: france
Posts: 760
Likes: 0
Received 0 Likes on 0 Posts
Snoop

That complexity is excessive, like a human who would taste everything in the grocery before buying his daily bread! We are using too much software, too much energy, and so on, that we don't need!

Of course, using tested modules and defensive layers is close to mandatory in the chase for the zero-bug level.

"99.9"... Once again, that number is just terrifying. We had that discussion with PJ2 already!
Zero is really very much lower than .1! The limit towards zero we have to seek is set by the Planck time! Most people don't know which "zero" they need and decide to reach. Having a one-in-a-hundred chance of dying on the first trip to the moon is acceptable, but a 1/100 or 1/1000 FADEC failure rate in an airliner engine is not. You cannot say you have debugged your software if you don't know exactly the risk and the price of that software's failure.
roulishollandais is offline  
Old 25th Oct 2015, 22:11
  #32 (permalink)  
 
Join Date: Sep 2014
Location: Canada
Posts: 1,257
Likes: 0
Received 0 Likes on 0 Posts
Welcome to reality!!!!

For example, if you look at the avionics software standard (RTCA DO-178C), even at the most stringent "Level A" (where a system failure would be catastrophic, i.e. complete loss of aircraft and occupants), the required test coverage is simply that every decision in the code has been exercised at least once and that each condition within a decision has been shown to independently affect the decision's outcome (so-called MC/DC test coverage).
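For a concrete toy illustration of MC/DC (the decision below is invented): each condition must be shown to independently flip the outcome, which for a two-input AND takes three test vectors rather than the four of exhaustive testing.

```python
def fuel_shutoff(overspeed, inhibited):
    # Invented decision: cut fuel if an overspeed is detected and the
    # shutoff is not inhibited.
    return overspeed and not inhibited

# MC/DC test set: vector 1 is the baseline; vectors 2 and 3 each flip
# exactly one condition and show that the decision changes with it.
mcdc_vectors = [
    (True,  False, True),   # baseline -> shutoff commanded
    (False, False, False),  # only 'overspeed' flipped -> outcome flips
    (True,  True,  False),  # only 'inhibited' flipped -> outcome flips
]
```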

This does not require that all possible input / output / reference data combinations have been exhaustively verified and validated.

Unfortunately the A400M crash is an example of such catastrophic failure, reportedly caused by missing reference (configuration) data.
peekay4 is offline  
Old 25th Oct 2015, 23:15
  #33 (permalink)  
 
Join Date: Jun 2009
Location: Canada
Posts: 464
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by Nialler
I've never ever seen a piece of bugfree software in 30 years of working with mission-critical systems.
You can write bug-free software, so long as it doesn't have to do anything useful. And, even if the software doesn't have bugs, the hardware it's running on does. And, even if the software doesn't have bugs and the hardware doesn't have bugs, the specifications for the software have bugs. And, even if none of them have bugs, the other systems you're talking to have bugs.
MG23 is offline  
Old 26th Oct 2015, 00:01
  #34 (permalink)  
 
Join Date: Mar 2002
Location: Florida
Posts: 4,569
Likes: 0
Received 1 Like on 1 Post
You can write bug-free software, so long as it doesn't have to do anything useful. And, even if the software doesn't have bugs, the hardware it's running on does. And, even if the software doesn't have bugs and the hardware doesn't have bugs, the specifications for the software have bugs. And, even if none of them have bugs, the other systems you're talking to have bugs.
Which is likely what this thread subject is all about. They will probably just tweak the FADEC to accommodate it somehow. It would be a little harder to modify an input, but that is always an option.
lomapaseo is offline  
Old 26th Oct 2015, 01:43
  #35 (permalink)  
 
Join Date: Feb 2005
Location: flyover country USA
Age: 82
Posts: 4,579
Likes: 0
Received 0 Likes on 0 Posts
And, even if the software doesn't have bugs, the hardware it's running on does.
I give you the T972 fan engine on the A380. The hardware had a small flaw (oil nozzle) that revealed a big flaw (IP turbine fatal failure mode), as QF32 revealed a couple years ago.

But the intel we outsiders have is that a FADEC fix was implemented - I suspect an N2/N3 mismatch detector. It's a logical approach, but it introduces new failure modes to the system. Life is never simple, is it?
barit1 is offline  
Old 26th Oct 2015, 02:42
  #36 (permalink)  
 
Join Date: Jul 2013
Location: Everett, WA
Age: 69
Posts: 4,550
Received 317 Likes on 154 Posts
I deal with "Design Assurance Level A" or DAL A software regularly. Nearly all the "software errors" we see are not really software errors - they are requirements errors. The software is doing exactly what we told it to do in the requirements, but the requirements were not representative of what was really wanted.
What's particularly common is that the requirements - as written - are not clear to the people implementing them. The problem is that the people writing the requirements know the system intimately, and they write requirements that are clear and make perfect sense to them - but the people who implement those requirements don't know the system or what it's expected to do, and they don't interpret the requirements as the writers intended.

Barit1, on most engines, if the shaft breaks, the turbine will move aft and clash with the stators - it's not pretty, but it prevents a turbine overspeed and uncontained failure (or if bits do escape, they are not "high energy" and don't do significant damage). For some reason, Rolls engines don't tend to do that. This problem showed up on the RB211-524 engine, where a few fan shafts broke - one event was on the center engine of an L1011, and the fan came down through the fuselage and tried to cut the aircraft in half. Rolls came up with a 'fan catcher' that would prevent the fan from leaving the engine. The next failure was on a 747: the fan catcher worked as intended, but the unloaded LP turbine oversped and exploded, cutting the rear of the engine off (and peppering the aircraft with shrapnel).
The Trent engine was developed with "LPTOS" - Low Pressure Turbine OverSpeed. Basically, the FADEC monitors the LP shaft speed at both ends - and if they disagree (within a small tolerance) it will shutoff the fuel. In the aftermath of the A380 event, Rolls has been implementing "IPTOS" (Intermediate Pressure TOS) on the various Trent models.
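tdracer's description of LPTOS suggests a sketch along these lines (names and the threshold are invented; the real logic is certainly more elaborate, with signal filtering, persistence timers, and fault accommodation):

```python
def lp_shaft_break_suspected(front_probe_rpm, rear_probe_rpm,
                             tolerance_fraction=0.05):
    # Compare shaft speed measured at both ends of the LP shaft.
    # An intact shaft keeps the two readings nearly equal; a break
    # unloads the turbine end, the readings diverge, and the FADEC
    # cuts fuel before the turbine can overspeed.
    reference = max(front_probe_rpm, rear_probe_rpm, 1.0)
    return abs(front_probe_rpm - rear_probe_rpm) / reference > tolerance_fraction
```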
Software is not perfect, but it has often been successfully used to address various hardware shortcomings.

The A400 crash may well be the first known accident due entirely to a problem with DAL A software. All I know about it is what I've read in news accounts and I'm anxiously awaiting the official report (hopefully Airbus/Rolls won't use the military aspects of the A400 to make the report confidential). But the news reports point to a glaring requirements error - properly designed FADEC software should have put up a 'no dispatch' warning if a critical calibration was undefined.
tdracer is offline  
Old 26th Oct 2015, 03:36
  #37 (permalink)  
 
Join Date: Sep 2014
Location: Canada
Posts: 1,257
Likes: 0
Received 0 Likes on 0 Posts
The A400 crash may well be the first known accident due entirely to a problem with DAL A software. All I know about it is what I've read in news accounts and I'm anxiously awaiting the official report (hopefully Airbus/Rolls won't use the military aspects of the A400 to make the report confidential). But the news reports point to a glaring requirements error - properly designed FADEC software should have put up a 'no dispatch' warning if a critical calibration was undefined.
Yeah, although that might be indicative of something more than just a requirements error, pointing to a larger process breakdown.

Typically there are high level requirements, specific system / software requirements, low-level requirements, etc., which all need to be traceable up and down between them, and also have full traceability to the code, to the binary, and to all the test cases (and/or formal methods verifications as applicable).

For all data elements, there should be specifications to check for valid value ranges (data domain), missing values (null checks), etc. Functions also need preconditions and postconditions specifying what parameter values are acceptable as part of the interface contract, plus assertions which must hold true.
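A minimal sketch of those data-domain and null checks (names and ranges invented): a missing or out-of-range calibration value is rejected at load time, rather than surfacing in flight.

```python
def load_torque_calibration(raw_value, low=0.5, high=2.0):
    # Null check: a missing calibration must block dispatch, not fly.
    if raw_value is None:
        raise ValueError("calibration missing - NO DISPATCH")
    value = float(raw_value)
    # Domain check: the (invented) valid range for this parameter.
    if not (low <= value <= high):
        raise ValueError(f"calibration {value} outside valid range {low}..{high}")
    return value
```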

There should also have been models of both the specifications and the design, and processes to check those models for completeness.

And even if there are data errors, as mentioned before the software should be designed to be fault-tolerant and fail safe instead of simply freezing up at 400' AGL.

What you don't want to do is to fix this one specific requirement while there may be other missing/incomplete/incorrect requirements out there. So you have to take a look into the SDLC process and figure out why the requirement was missed to begin with.
peekay4 is offline  
Old 26th Oct 2015, 03:51
  #38 (permalink)  
 
Join Date: Jan 2006
Location: Ijatta
Posts: 435
Likes: 0
Received 0 Likes on 0 Posts
All this software talk is starting to make me long for plain old Jurassic J57 or JT8D technology.

Last edited by wanabee777; 27th Oct 2015 at 06:53.
wanabee777 is offline  
Old 26th Oct 2015, 11:52
  #39 (permalink)  
 
Join Date: Oct 1999
Location: LHR
Posts: 559
Likes: 0
Received 2 Likes on 1 Post
I witnessed the Ethiopian 787 arrival in to DUB recently and ATC at that awful place never cease to amaze me........

The arrival of the 787 with an IFESD was known well in advance as he was returning off the ocean. We arrived in the 1/2 hour before he landed and all the airport & external units for the emergency were already in place around the airport.

The Ethiopian 787 landed without incident, vacated, talked to the Fire Service and when all was established as safe they elected to proceed to the terminal. Up to that point it all seemed to be handled rather well...... (Congratulations lads!).

With the Fire Service following the 787 was directed by ATC to a stand on the main terminal. For reasons best known to themselves the Fire Service then rolled out multiple hoses around the area in preparation to fight any fire they might have missed earlier. As a precautionary measure this would have been all well and good except that ATC had parked the aircraft on stand 204 in very close proximity to a BA A320 being refuelled and boarded for departure. The sight of the fire-fighters rolling out their hoses around an aircraft with 200 pax on board that was blissfully loading fuel from an airport bowser was rather comical.

I probably visit 20-30 different European airports every month, some of them with diverse terrain, ATC or cultural challenges. However, DUB has to be near the top of my dangerous-avoid list. The ground environment is absolutely appalling, with no fewer than 4 different taxiway nomenclatures guaranteed to confuse the visiting pilot, and clearances delivered at break-neck speed. There is no recognition that visitors to DUB might be complete strangers, and if you question a clearance the reply carries a degree of arrogance found in very few places elsewhere. I can quite understand why they have a history of ground incidents.

There is little logic to the way things happen in DUB. Whilst the rest of the world is pretty much standard ICAO, Dublin carries on business in its own little bubble.
Magplug is offline  
Old 26th Oct 2015, 13:15
  #40 (permalink)  
 
Join Date: Mar 2014
Location: wales
Age: 81
Posts: 316
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by DaveReidUK
The 777 doesn't have an external engine-ferrying capability.

The GEnx fits (just) on the main cargo deck of the 777F: https://www.youtube.com/watch?v=X2jZp35BvjU
Wow, a tight fit indeed - I can think of a few places where you wouldn't want local loaders putting that on/off.
oldoberon is offline  
