New Software Issues Found on the MAX
Join Date: Apr 2015
Location: Under the radar, over the rainbow
Posts: 790
@clearedtocross has it:
Quote:
Originally Posted by Fly Aiprt View Post
...what is unclear is why tests in a real aircraft come so late into the development timeline ?...
Quote:
Originally Posted by clearedtocross View Post
The problem with interactive computer systems with both synchronous and async interactions is that any amount of testing does not make sure they always work. You have to get it right by design, not by testing...
For those who were trained to fly rather than to write real-time software, here is a little example that shows the problem:
Imagine a single-track railway connecting two very remote stations, A and B, with usually only one or two trains travelling in each direction per day. The single track is protected by a signal light at each end, which shows red by default. When the driver at A wants to leave for B, he presses a "start" button and gets a green light at A, while the light at B remains red even if the driver at B presses his button too. When driver A arrives at B, he presses the "end" button to release the line (and the lights). Obviously a driver at B would do the same in the opposite direction. Now this is tested and it works perfectly, again and again... until one day the buttons at both ends are pressed at exactly the same moment (let's disregard Einstein's relativity theory and Heisenberg's uncertainty principle). What will happen? It depends on the guy who programmed the light control system. If both lights remain red, you will get angry drivers. If both lights go green, you will get dead drivers and SLF. So the programmer must have thought about this possible problem and implemented some solution (like priority scheduling, look-ahead locking, etc.).
This is what I meant when I wrote that making the design safe is vital before anything gets tested, because tests will not always reveal unlikely but still possible events (like the failure of a sensor). And in a complex system, it's far from easy and not something to be done in a hurry.
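A minimal sketch of one way to get the railway example right by design, assuming a single shared "track owner" variable that both ends update with an atomic compare-and-swap. The names `Station`, `SingleTrackInterlock`, `request` and `release` are made up for illustration and are not taken from any real signalling system. Whichever button press reaches the variable first wins and the other end stays red, so "both green" cannot happen however the presses are timed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical identifiers for the two ends of the line.
enum class Station : int { None = 0, A = 1, B = 2 };

class SingleTrackInterlock {
public:
    // "start" button at `origin`. Returns true (green) only if the track was
    // free at the instant of the atomic swap; otherwise the light stays red.
    bool request(Station origin) {
        assert(origin != Station::None);
        Station expected = Station::None;
        return owner_.compare_exchange_strong(expected, origin);
    }

    // "end" button, pressed on arrival by the train that left from `origin`.
    // Only the current holder can free the track.
    bool release(Station origin) {
        assert(origin != Station::None);
        Station expected = origin;
        return owner_.compare_exchange_strong(expected, Station::None);
    }

    // A light is green only for the station that currently owns the track.
    bool is_green(Station s) const { return owner_.load() == s; }

private:
    std::atomic<Station> owner_{Station::None};  // None means both lights red
};
```

Pressing both "start" buttons at once then amounts to two `request` calls racing on one atomic variable, and exactly one of them returns true. No test has to hit the simultaneous case for this to hold; it follows from the design.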
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Yes, I think most of us here understand the above. The point I've been trying to make is that you learn whether or not you've gotten it right by testing and, if your system is failing to run, or failing a POST/initialization check, you should probably notice that well before anyone on the team suggests that "it's finished/almost finished."
The situation in which Boeing and U.S. aviation in general find themselves today is one in which announcing vaporware is probably a serious mistake.
Join Date: Mar 2019
Location: French Alps
Posts: 326
Thanks to all who responded and provided examples.
Indeed my Java teachers taught us to consider and catch exceptions.
The nagging question is, considering the differences between their "engineering cab" and a real airplane, why were the real software flight tests performed so late (December)?
What kept them from flying the thing and doing ramp tests 6 or 9 months ago?
Or is it another aspect of the "no need to fly the real thing before" mentality?
"No need for sim time, just a tablet", "no need to test fly, just run the cab"...
Makes you wonder..
Join Date: Apr 2015
Location: Under the radar, over the rainbow
Posts: 790
The Boeing engineers did not have the luxury of starting from a clean sheet and "getting it right by design". They had stiffware that had been operational for years and had to modify the code so that the FCCs operated in a different way, without changing anything that was not essential to the task at hand and without breaking any current functions. Maintenance programming, especially of embedded code, where the code must be changed to do things differently without affecting anything else, is nothing like simple code writing. It is possible that some very basic timing issue made the live aircraft behave slightly differently from the avionics test bench. This is the reason regression tests are run when the new code is ported to and implemented in the aircraft. The tests found an issue, and that is what the tests are for.
Join Date: Jan 2008
Location: Wintermute
Posts: 76
Join Date: Mar 2019
Location: French Alps
Posts: 326
But does that imply that it is valid to defer testing of aircraft-critical software in the real airplane until the last moment?
That is, if Boeing now considers MCAS to be safety-critical software.
Want to learn programming - C or assembler or hand code machine language. Java is to programming what MS Flight Simulator is to an F-15 Eagle.
Join Date: Apr 2019
Location: Toronto
Posts: 20
No need to pile on Fly Aiprt. True, coding Java isn't really comparable to real-time systems in assembler (something I haven't done, either) but the very valid point stands:
Why wasn't this being tested on real hardware before? Any new internal software version, and certainly any "release candidate" build, should be moved from the developer's machine, to some production test lab, to a real system, expeditiously.
Join Date: Apr 2019
Location: EDSP
Posts: 334
Indeed, flying would not even have been necessary. Loading and starting up would have been enough.
Overconfidence in lab setups is, by the way, another common theme. The Ariane 5 failure was in that category (I posted a link before, but don't have it at hand right now). And it looks like the Starliner failure will be as well.
Last edited by BDAttitude; 21st Jan 2020 at 09:07.
Join Date: Nov 2019
Location: Earth
Posts: 7
Hopefully the compiler did the same: Java is somewhat forceful about handling exceptions and requires you to be explicit about which exceptions will be thrown.
From an outside POV Boeing appears to have rather lax software development processes (and nearly no QA) in place.
Valiant attempt at gatekeeping aside, there is indeed a formal variant of Java designed for real-time systems (RTSJ). For something developed in the 90s based on older hardware and software I'd expect that Ada was the language of choice. The DoD developed Ada explicitly for real-time safety-critical systems, and that's why Boeing chose it for the 777. But I digress.
Java? Java does a ton of work and hides a lot of details behind a pile of software that won't fit on an FCC and may or may not be busy doing something you don't want it to do when something important needs to be done. Welcome, garbage collection.
Want to learn programming - C or assembler or hand code machine language. Java is to programming what MS Flight Simulator is to an F-15 Eagle.
By the way, object-oriented languages (like Java, C++, etc.) cannot be used to program controllers because objects need dynamic memory allocation, and lots of memory and address space are simply not available in controllers. But, used with care, there is nothing wrong with programming in C (as on the ubiquitous Arduino), standard Fortran IV, or even assembler, provided the specifications and the code are well documented and kept up to date.
Join Date: Nov 2019
Location: Earth
Posts: 7
Rubbish. You can do anything in C++ that you can do in C, and a great deal more safely. I speak from much experience. People often confuse C++ with more "automagic" languages like Java and Ruby. C++ has no garbage collection, and if you want to manage memory allocation yourself, including doing everything statically, it's easy.
You absolutely can allocate memory statically in C++, although microcontrollers these days are more capable than desktop processors were thirty years ago. Hell, you can run Java on the 8-bit AVR microcontrollers that made Arduino famous (as well as newer ARM-based stuff like the STM32 line).
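To illustrate the static allocation point, here is a minimal sketch of a fixed-capacity queue in C++ whose storage is a `std::array` inside the object, so an instance placed in static storage never touches the heap. `StaticQueue`, `SensorSample` and `g_samples` are hypothetical names, and nothing here is claimed to resemble actual FCC code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity FIFO: capacity is a compile-time constant and all storage
// lives inside the object, so there is no new/delete anywhere.
template <typename T, std::size_t Capacity>
class StaticQueue {
public:
    // Returns false when full instead of growing.
    bool push(const T& value) {
        if (count_ == Capacity) return false;
        items_[(head_ + count_) % Capacity] = value;
        ++count_;
        return true;
    }

    // Returns false when empty; otherwise copies the oldest item into `out`.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = items_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    // Oldest item; calling this on an empty queue is a programming error.
    const T& front() const {
        assert(count_ > 0);
        return items_[head_];
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, Capacity> items_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Hypothetical payload type for the illustration.
struct SensorSample {
    int channel;
    int raw;
};

// Lives in static storage; its memory footprint is known at link time.
static StaticQueue<SensorSample, 8> g_samples;
```

When the queue is full, `push` reports failure rather than allocating, which makes the worst-case memory use a property of the design instead of something to be discovered in testing.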
Ultimately this is likely the problem, and a rather horrifying one at that. That style (and level) of testing wouldn't pass muster at some wanky startup developing a social network for your pet and it ought not be the way that a company selling $50 million flying aluminum tubes conducts business.
Join Date: Mar 2019
Location: French Alps
Posts: 326
Thanks Tobin, BDAttitude and clearedtocross.
Of course nobody suggested any FCC could be programmed in Java. That was an example of how the basics of specifying code are all about managing exceptions and crosschecks. Sorry if it wasn't clear.
Here is a link to some research done in programming a 737 version some years ago.
http://www.cse.cuhk.edu.hk/~lyu/paper_pdf/00005291.pdf
Interesting to see what languages were tried, and how long it takes to produce even an experimental version.
C, C++, Pascal and Ada are not uncommon.
And yes, it appears the MCAS "fix" has been rushed and not really tested in the real world.
One wonders what would have happened if the FAA were still under the influence of Boeing and had returned the airplane to flight...
At the outset, I should say I have no experience of coding FCC stuff, so I could be talking through the wrong orifice. However:
Reading the above, if the stability of the FCC cannot be assured using C++, which I know generally to be very stable, could a problem be occurring in the interface between the FCC hardware and the code? Is it possible that an asynchronous interrupt from an external sensor (or sensors) is not being set consistently by the hardware, so that when the code goes to look for the bits, they are not there? Of course, when you are in a hurry, the last thing to get done is the error reporting and recovery code for a missing interrupt.
IG
Last edited by Imagegear; 21st Jan 2020 at 10:10.
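The missing-interrupt scenario can be sketched as follows, purely as an illustration and not as a claim about what the FCC does. It assumes a hypothetical `SensorPort` whose `data_ready` flag would be set by an interrupt handler, and a reader that polls a bounded number of times and reports a timeout instead of silently carrying on with stale data. All names are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class ReadStatus { Ok, Timeout };

// Hypothetical stand-in for a sensor interface shared with an interrupt handler.
struct SensorPort {
    std::atomic<bool> data_ready{false};   // set by the handler, cleared by the reader
    std::atomic<std::uint16_t> value{0};   // latest raw reading
};

// What an interrupt handler would do: publish the value, then raise the flag.
inline void on_sensor_interrupt(SensorPort& port, std::uint16_t raw) {
    port.value.store(raw);
    port.data_ready.store(true);
}

// Poll for the flag at most `max_polls` times. If the interrupt never arrives,
// the caller gets an explicit Timeout to report and recover from.
inline ReadStatus read_sensor(SensorPort& port, unsigned max_polls,
                              std::uint16_t& out) {
    assert(max_polls > 0);
    for (unsigned i = 0; i < max_polls; ++i) {
        // exchange(false) clears the flag and returns whether it had been set.
        if (port.data_ready.exchange(false)) {
            out = port.value.load();
            return ReadStatus::Ok;
        }
    }
    return ReadStatus::Timeout;
}
```

The point of the sketch is that the "interrupt never came" path is an ordinary return value that has to be handled, rather than something left for later when the schedule is tight.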
Join Date: Feb 2015
Location: UK
Posts: 35
Well, yeah. It isn't quite "nothing to see here", but this is exactly the sort of failure you can sometimes get with any software system going from test or staging environments (eng sim) to production: test never quite does things exactly the same. Looks like the failure happened at the right place and time (i.e. on the ground) and was caught by the existing self-checks.
The surprising thing for me is that this appears to mean they have not yet flown the final fix. So are all those previous test flights useless now? Was this the reason for the test-flight hiatus, i.e. not that they'd finished testing (as some said) but that the software wasn't final yet?
Flight test results which allow system parameters to be optimised will still be valid even if you have to go back and fix some built-in-test code. But the unit test, software bench test, hardware/software integration testing etc would all have to be repeated for the functions affected by any software change.
Join Date: Feb 2013
Location: Gloucestershire, UK
Posts: 34
I feel sorry for the code-writers who are working on this. I mean the ones actually doing the job, trying to understand and solve the problems of the system they need to change, while the rest of the company and its suppliers, thousands and thousands of people mostly with families and mortgages, just stand around twiddling their thumbs and praying that today is the day the damn thing actually works. Just think what that must feel like for the coders, knowing that all those thousands of other people are kind of peering over their shoulders and willing them to stop messing about and just fix this, right now. And I bet the team of coders is tiny. It has to be tiny. The job can't be done by a huge team, it's not that sort of job. Throwing extra resources at it would be nonsense - in fact it would be counter-productive. If I'm right there's this tiny group of very clever people coming in to work each day, to spend another day struggling to dig the whole Boeing company out of the hole it's in, none of them making the sort of money the Boeing board does, and none of them responsible for the design errors which caused the problem in the first place. So I feel sorry for them.
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
We know that the MAX still uses 286s, very limited computers dating back to the 1980s. I suspect that the software is written in assembly language rather than some modern high-level language, because that allows the limited hardware to be exploited to the maximum.
Unfortunately, such near machine language programming is a bear to write correctly and a beast to debug. So it is not a popular pursuit, especially as computing resources are normally dirt cheap compared to software writers.
If this supposition is correct, the work now falls on the small bunch of surviving veterans left over after the waves of 'efficiency improvements' cut the headcounts. There is no backup Team B available, nor could one create such from scratch.