
Ethiopian airliner down in Africa

Rumours & News Reporting Points that may affect our jobs or lives as professional pilots. Also, items that may be of interest to professional pilots.

Old 31st Mar 2019, 16:26
  #2821 (permalink)  
 
Join Date: Nov 2018
Location: Vancouver
Posts: 68
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by GarageYears


Firstly, as it stands the cause of the second crash is unknown. Fingers pointing at MCAS are speculation, at least until the interim report is published. It may well be a better narrative than other options.

Notwithstanding that, the Ethiopian environment is way different from the Lion Air one. Check out the MSL altitude of both departure airfields...

Finally, the same AOA sensor is flying in several thousand 737NGs today. It doesn’t seem the sensor is likely to be to blame.

Truth is the Lion Air aircraft shouldn’t have been in service, given the maintenance log and lack of accurate documentation of issues with the aircraft on previous flights. As for Ethiopian we just don’t know any facts, other than the actual crash.

- GY
Well, actually this Ethiopian investigation is almost as leaky as the Indonesian one... The main suspects are very much the same: AOA reading and MCAS.

Since MCAS, in its past iteration, has a knack for driving the trim mechanism to the end of the jackscrew once it is fed erroneous data by a single AOA vane, essentially doing its job as programmed spectacularly "well", that will leave the AOA vane as the fall guy.

Except... it is ALMOST IMPOSSIBLE that vanes which have been in use since forever, and are thought to be very reliable, would be the cause of two crashes within the span of 5 months. That leaves us with something more plausible as the cause, which has been largely ignored: the Max-8 flight control system or something therein...
patplan is offline  
Old 31st Mar 2019, 17:11
  #2822 (permalink)  
 
Join Date: Mar 2018
Location: UK
Posts: 2
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by GordonR_Cape
A third AOA would only have worked if the Boeing 737 MAX had new flight control computers like other models (including Airbus). That was never going to happen, due to the huge cost, certification and training issues. I never implied that 3 AOA sensors have no function, but unless the system architecture can process and vote on them, the third one has no purpose.
I bet Boeing wishes now that they had spent that extra cash!
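
For anyone wondering what "process and vote" means in practice, here is a rough sketch of mid-value select over three AOA readings. It is only an illustration: the function names and the 5 degree disagreement limit are made up for the example and are not taken from any Boeing or Airbus system.

```python
from statistics import median

# Hypothetical threshold, chosen only for illustration.
DISAGREE_LIMIT_DEG = 5.0


def vote_aoa(a, b, c, limit=DISAGREE_LIMIT_DEG):
    """Mid-value select over three AOA readings (degrees).

    Returns (selected_value, outliers), where outliers lists the indices
    of readings that differ from the median by more than `limit`.
    """
    readings = [a, b, c]
    selected = median(readings)
    outliers = [i for i, r in enumerate(readings) if abs(r - selected) > limit]
    return selected, outliers


def compare_two(a, b, limit=DISAGREE_LIMIT_DEG):
    """With only two readings a disagreement can be detected, not resolved."""
    return abs(a - b) > limit
```

With three inputs a single wild reading is outvoted and can be identified; with two, the best the logic can do is flag that they disagree, which is the point made in the quote above.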
hjd10 is offline  
Old 31st Mar 2019, 17:45
  #2823 (permalink)  
 
Join Date: Aug 2005
Location: EDLB
Posts: 362
Received 4 Likes on 3 Posts
I take bets that it has something to do with the signal wiring from the AoA vane to the flight computer (ADIRU), like shorting out one half of the symmetric SIN or COS signal and thereby creating an offset of something around 45 degrees/2. If the Ethiopian Airlines FDR shows a similar problem, then there is some latent harness, connector or ADIRU problem which will show up in other 737 MAXs made in a similar timeframe. If that is established, the investigation might look into some of the grounded planes built in that timeframe.
EDLB is offline  
Old 31st Mar 2019, 17:57
  #2824 (permalink)  
 
Join Date: Mar 2019
Location: Usa
Posts: 1
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by Blythy
As an example, on the Space shuttle, there were four identical computers which voted against each other in the case of discrepancy. However, there was a 5th computer (limited to ascent and reentry only) which was different hardware and different software in the event of something which had the same root cause in the software / hardware.
Not entirely true. All 5 computers are AP-101s. The 5th had a different subset of functions for ascent and descent, written by Rockwell (IBM was the main contractor for the hardware and flight software). It wasn't a complete rewrite of the flight software. The reason given for not having different hardware was that it would have cost too much. The software itself was an OS written in assembly, with the main code written in HAL/S, possibly on different versions of the compiler.

Has there ever really been an aircraft with 2 completely separate hardware and software teams?

Info is from
Computers in Spaceflight: The NASA Experience Chapter 4-3
and The Space Shuttle Primary Computer System, Communications of the ACM September 1984 Volume 27 Issue 9
Daft01 is offline  
Old 31st Mar 2019, 18:28
  #2825 (permalink)  
 
Join Date: Jun 2009
Location: Dorset
Posts: 31
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by safetypee
Whilst all of you Tech ‘bit’ people provide valuable information and possible scenarios, could you please consider why ‘failures’ appear to be very rare and so far only relate to two aircraft / three vanes.

How something fails does not necessarily explain why (when) it failed.
Random, probabilistic, bit count, world clock ?
OK, having considered why the MCAS failures only appear on some flights, possible candidates (the holes in the cheese) that could trigger a software fault in the processing of an AoA correction table (bearing in mind that the ADIRU software was developed only to a “non-safety critical standard”) are:-
1) pin fault: How does an ADIRU recognise it is L or R? Similar systems I’ve worked on in the past had a fixed pin in the harness connector of one box to designate it as L. If on the problem 737 Max flights the pin had a bad connection, or was bent/missing, the ADIRUs would think they are both L (or both R).
2) IAS fault: As pointed out by patplan in #2799 there was an “IAS & ALT Disagree shown after take off”. If the correction table is indexed by IAS (or has some dependency on it), the software could have used bad IAS data as an index and read garbage data for the correction items.
3) interrupt corruption: A problem that has bitten me hard on a few occasions over the years is with the software that handles interrupts. Typically, interrupt software has to save data held in registers, perform its actions, and then restore the registers to their entry values. A latent problem can exist (just waiting for the right holes to line up) where a piece of software uses another register that the spec writer of the interrupt routine was unaware of. Or, more likely, a software update was made (e.g. an MCAS data preparation update) that used another register. So after 100s or 1000s of flight hours the holes line up, the interrupt pings off in the middle of the new software, and something (could be a data value, a status flag, a jump address or ….) is corrupted. The consequential behaviour could be any of many surprises!
4) processor overload:
Originally Posted by patplan
I have a suspicion that in Boeing 737 Max 8 [B38M] perhaps the LEFT/CAPT ADIRU is constantly being overwhelmed by new routines [i.e. MCAS/AOA related programmings] which may from time to time corrupt the system.
I agree, as the (old tech) processors become more and more loaded, there will come a point where, given the right circumstance of several things needing to be computed in one cycle, a software routine will not complete. Just as an example, the start up stage will be quite busy. I would expect the ADIRU to determine its L or R status (perhaps read a pin) and store the result for other software routines to use. So if this action does not complete, the ADIRU L or R status will stay at the default value; the ADIRUs would both stay as L (or R).
5) flap position: From the preliminary report (Fig. 5 on accident flight & Fig. 7 on previous flight) there is a difference in when the flap position changes. Fig 5 shows a change well after rotation, Fig 7 shows a change at the point of rotation. Could this difference have affected how MCAS subsequently behaved?
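
To make candidate 3) concrete, here is a toy model of the failure mode. It is purely hypothetical: a four-register "CPU" in a Python dict, an interrupt routine that saves and restores only the registers it was specified to use, and a main task that keeps a running value in a register the interrupt routine later started writing to. Nothing here represents actual ADIRU code.

```python
# Toy model of a CPU with four registers and an interrupt handler.
# Everything here is hypothetical; it only illustrates the failure mode.

def isr(regs, saved=("r0", "r1")):
    """Interrupt routine: saves the registers it believes it uses,
    does its work, then restores them. It also writes r2."""
    backup = {name: regs[name] for name in saved}
    regs["r0"] = 111      # scratch work
    regs["r1"] = 222
    regs["r2"] = 333      # added by a later update; only safe if r2 is in `saved`
    regs.update(backup)   # restore only what was saved


def main_task(interrupt_at=None, saved=("r0", "r1")):
    """Main code keeps a running sum in r2 across three steps."""
    regs = {"r0": 0, "r1": 0, "r2": 0, "r3": 0}
    for step in range(3):
        if step == interrupt_at:
            isr(regs, saved)          # interrupt fires mid-computation
        regs["r2"] += 10
    return regs["r2"]
```

Without an interrupt the task returns 30. If the interrupt lands between steps, r2 is overwritten and the result is garbage; adding r2 to the saved set makes the result 30 again. The bug only shows when the interrupt timing lines up with the window in which r2 is live.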
VicMel is offline  
Old 31st Mar 2019, 19:40
  #2826 (permalink)  
 
Join Date: Jan 2008
Location: Reading, UK
Posts: 15,788
Received 196 Likes on 90 Posts
Originally Posted by VicMel
So after 100s or 1000s of flight hours the holes line up, the interrupt pings off in the middle of the new software, something (could be a data value, a status flag, a jump address or ….) is corrupted. The consequential behaviour could be any of many surprises!
That sounds like a description of a random failure, rather than something that would manifest itself over several consecutive flights, as was the case with Lion Air.

DaveReidUK is offline  
Old 31st Mar 2019, 19:46
  #2827 (permalink)  
 
Join Date: Aug 2013
Location: Washington.
Age: 73
Posts: 1,071
Received 151 Likes on 53 Posts
Originally Posted by bill fly


Well I don’t agree, for me AoA is an analog value, which can be related directly to vane angle much more easily on a dial than on yet another strip display.
The AoA display should be designed to support the pilot’s proper and effective use of it (whatever that is). It should be considered, not in isolation, but in the context of the rest of the flight display(s) and the instrument scan which the pilots are expected to conduct. Does any airline which has aircraft equipped with the AoA display have approved pilot procedures for its use? If a check ride was conducted, in which phases of flight would a pilot be faulted for failure to maintain awareness of the AoA display? If one were to evaluate the AoA display design, what measure of performance would be used? Considering the approved Boeing EFIS with the AoA in the upper right corner, how does that fit Basic T flight display philosophy? Of course it doesn’t, because AoA was never part of the Basic T. But if there was a logical, task performance-based purpose for the AoA display, why would it be placed above the altitude display, about as far from the airspeed and attitude indications as it could be? Yet, we suppose it enhances safety?
GlobalNav is offline  
Old 31st Mar 2019, 19:58
  #2828 (permalink)  
 
Join Date: Nov 2018
Location: madrid
Posts: 47
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by EDLB
I take bets that it has something to do with the signal wiring from the AoA vane to the flight computer (ADIRU), like shorting out one half of the symmetric SIN or COS signal and thereby creating an offset of something around 45 degrees/2. If the Ethiopian Airlines FDR shows a similar problem, then there is some latent harness, connector or ADIRU problem which will show up in other 737 MAXs made in a similar timeframe. If that is established, the investigation might look into some of the grounded planes built in that timeframe.
In theory (older 737s), each computer takes the three wires (the two analog signals), amplifies them, demodulates them (turns them into DC) using the reference AC that powers the vane, filters them, and finally feeds them to an A/D converter. The values are stored in a memory block; a software block then reads them and translates them into an AOA in degrees (atan(sin/cos)), which is filtered once more (so two AOAs are available, raw and filtered).

I would be really surprised if the software block did not, at that point, perform the plausibility check (sin^2+cos^2=vmax^2). (For instance, it shows a warning if the vane didn't move more than 3 degrees for a period of time.)

But even if it didn't, shorting one of the signals to vref would produce 9 degrees offset at the vane. Does that translate to 22 degrees airplane AOA?
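
A minimal sketch of the decode-plus-plausibility step described above, assuming the two demodulated voltages are already available as numbers. The function name, the normalised vmax of 1.0 and the 5% tolerance are assumptions for the example only; atan2 is used instead of atan(sin/cos) so the quadrant is handled and a zero cosine does not divide by zero.

```python
import math


def decode_resolver(sin_v, cos_v, vmax=1.0, tol=0.05):
    """Return (angle_deg, plausible) from demodulated sin/cos voltages.

    plausible is False when sin^2 + cos^2 departs from vmax^2 by more
    than the fraction `tol` (a hypothetical tolerance).
    """
    angle_deg = math.degrees(math.atan2(sin_v, cos_v))
    magnitude_sq = sin_v ** 2 + cos_v ** 2
    plausible = abs(magnitude_sq - vmax ** 2) <= tol * vmax ** 2
    return angle_deg, plausible
```

One caveat with this kind of check: if the lost channel was close to zero anyway (a small angle with the sine channel dead, say), the sum of squares barely changes and the fault can slip through until the angle moves.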
ecto1 is offline  
Old 31st Mar 2019, 21:01
  #2829 (permalink)  
 
Join Date: Feb 2017
Location: Adelaide
Posts: 3
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by patplan
Well, actually this Ethiopian investigation is almost as leaky as the Indonesian one... The main suspects are very much the same: AOA reading and MCAS.

Since MCAS, in its past iteration, has a knack for driving the trim mechanism to the end of the jackscrew once it is fed erroneous data by a single AOA vane, essentially doing its job as programmed spectacularly "well", that will leave the AOA vane as the fall guy.

Except... it is ALMOST IMPOSSIBLE that vanes which have been in use since forever, and are thought to be very reliable, would be the cause of two crashes within the span of 5 months. That leaves us with something more plausible as the cause, which has been largely ignored: the Max-8 flight control system or something therein...
Except the AOA is not hooked up to MCAS on the NG
Wilderone is offline  
Old 31st Mar 2019, 21:32
  #2830 (permalink)  
 
Join Date: Dec 2002
Location: UK
Posts: 2,451
Likes: 0
Received 9 Likes on 5 Posts
VicMel, #2858, thanks for the reply.

So with my simplistic view the problem appears to be random, chance. Alternatively, as a sceptic, why 2 aircraft in 4 months, whereas the remaining fleet …
OK, so this is the nature of probability, together with the ever-increasing fleet size.

Thus the next question is where a ‘good’ AoA software fix could - should be made, but if not … at least the output of MCAS should be limited.
And if AoA is not fixed (still probability), there could still be problems with speed pressure error correction, air-data disagree, feel, and low speed awareness. Will these events be no more frequent than experienced in previous 737s? Or, if it is still an issue with the Max (inadequate software / FGC ADIRU overloaded), will there be an increase in disagree alerts due to ‘corrupt’ AoA?

I may be dancing around the same tree as in my post in the other thread - #485 Boeing 737 Max Software Fixes Due to Lion Air Crash Delayed
Where is the value of AoA sampled by the FDR? Would this clarify current understanding?



safetypee is offline  
Old 31st Mar 2019, 22:00
  #2831 (permalink)  
 
Join Date: Jul 2007
Location: the City by the Bay
Posts: 547
Likes: 0
Received 0 Likes on 0 Posts
Is the Boeing 737 MAX Worth Saving?

can Boeing save it?
armchairpilot94116 is offline  
Old 31st Mar 2019, 22:41
  #2832 (permalink)  
fdr
 
Join Date: Jun 2001
Location: 3rd Rock, #29B
Posts: 2,944
Received 847 Likes on 251 Posts
Originally Posted by GordonR_Cape
It is deeply ironic that the issue MCAS was designed to cater for was never flight critical, and might never have occurred during the lifetime of the aircraft. Instead the fix ended up killing hundreds of people.
This highlights the underlying issue that the industry has processes that can bite back. To achieve compliance with a particular rule a simple fix is implemented, and that has a potential for unintended consequences. The failure mode of the compliance fix has its own unknown interaction with the operating system at the man-machine interface; somewhere along the way recognition failed as to the underlying cause, for a crew that had never heard of the "fix" and for another crew that had learnt of the problem due to the revelations of the first crew's misfortune. In fact, the knowledge gained in the flight preceding JT610 was lost on the next crew as well; the system doesn't allow for the timely transfer of information, and it probably cannot do so under any process that validates the information and the output to avoid errant information being introduced.

The constant offset once in motion would appear inconsistent with a loss of the sin or cos output alone as far as I understand the use of those functions to derive the A-D output state. Contend as previously commented that the sensor itself is unlikely to be the component that has the fault, which leads to the install, wiring, or processing of the signal as being the point of failure. The loss of a single resolved output is intriguing, giving an erroneous result but it would appear that the offset error would alter with the change of actual AOA. The aircraft was operated from low speed, through to high speed, with substantial change in actual AOA, but the offset appears to be constant.
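
A quick numerical sketch of why a lost or halved channel should not give a constant offset. The assumptions are mine: electrical angle equal to vane angle (1:1), and the fault modelled simply as a gain on one channel before the atan step. The function names are made up for the example.

```python
import math


def decoded_angle(true_deg, sin_gain=1.0, cos_gain=1.0):
    """Angle (deg) the decoder computes when each channel is scaled by a gain."""
    t = math.radians(true_deg)
    return math.degrees(
        math.atan2(sin_gain * math.sin(t), cos_gain * math.cos(t))
    )


def error_table(angles_deg, sin_gain=1.0, cos_gain=1.0):
    """List of (true_angle, decoded - true) pairs for the given gains."""
    return [(a, decoded_angle(a, sin_gain, cos_gain) - a) for a in angles_deg]
```

Calling error_table([0, 10, 45, 80], sin_gain=0.5) gives an error of zero at 0 degrees, about -5 degrees at 10, about -18.4 degrees at 45 and about -9.4 degrees at 80. Under these assumptions the error moves with the true angle, so it would not look like a fixed offset across a flight with large AOA changes.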
fdr is offline  
Old 31st Mar 2019, 23:04
  #2833 (permalink)  
 
Join Date: Feb 2011
Location: Leeds
Posts: 1
Likes: 0
Received 0 Likes on 0 Posts
I think the publicity that the Max has generated over the last few weeks, especially since the grounding, could kill the airframe off... there may be no way back for it. The public are very powerful and could refuse to fly on it even after Boeing has "updated IOS" or whatever they are doing. I personally won't be going near one with my family even after it's "fixed", and I fly for a living, so Joe Public will be even more cautious.

I honestly think it's crazy that we are now in a situation where a plane is crashing and we are saying we need a software update... it's gone too far.

People accept that humans make errors and pilots sometimes screw up; what they won't accept is software in charge of their lives.
Livesinafield is offline  
Old 31st Mar 2019, 23:37
  #2834 (permalink)  
 
Join Date: Mar 2019
Location: Bavaria
Posts: 20
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by fdr
The constant offset once in motion would appear inconsistent with a loss of the sin or cos output alone as far as I understand the use of those functions to derive the A-D output state. Contend as previously commented that the sensor itself is unlikely to be the component that has the fault, which leads to the install, wiring, or processing of the signal as being the point of failure. The loss of a single resolved output is intriguing, giving an erroneous result but it would appear that the offset error would alter with the change of actual AOA. The aircraft was operated from low speed, through to high speed, with substantial change in actual AOA, but the offset appears to be constant.
There are not many failure modes which may cause such a constant deviation. If normal checks are in place, it rules out everything except wrong calculation within or after atan(sin/cos). Cabling, loss of ground, ADC error... there are checks for such problems and they do not cause a constant offset.
Again, there are 2 possibilities left if I didn't miss something (I'm evaluating such resolvers for 2 safety relevant systems within electric cars at the moment):
-> Electromagnetic interference (EMI) at exactly the frequency the sensor is working on (or the sensor locks on to the interference frequency with its resonator). I tried to find a correlation between engine running/rpm and sensor failure but could not find any. EMI from the new engines would have been a nice one.
-> Error within the calculation after sin/cos and the plausibility check (sin²+cos²=1) -> Software change / bug?

The sin and cos voltage is simply the x and y of a 2D unit vector (look for 'unit vector' on english wikipedia, I'm not allowed to post a link) The receiver simply checks if the unit vector has a length of 1. If a cable breaks or the ADC has an error, it won't.
If there is no need to measure 360°, the electrical full circle is often a fraction of the mechanical one. So the electrical vector would make 2/3/4 turns on one mechanical revolution of the fin. Therefore 22.5° deviation could come from 90° signal error or calculation error. Those 22° somehow smell like some 90° computational error (e.g. wrong sign). Especially since atan calculations in old software only have a table for one quadrant of the unit vector and then switch signs or add/subtract 90°/180°/270°.
Switching cables (sin/cos) would btw. invert the angle (90°-x).
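
The arithmetic above can be sketched as follows. The assumptions are labelled: a resolver with 4 electrical turns per mechanical revolution (one of the 2/3/4 options mentioned, picked because 90°/4 = 22.5°), and a computational error modelled as a fixed 90° added in the electrical domain. The function names are hypothetical.

```python
import math


def resolver_voltages(mech_deg, turns_per_rev):
    """Ideal (sin, cos) voltages for a given mechanical angle."""
    e = math.radians(mech_deg * turns_per_rev)
    return math.sin(e), math.cos(e)


def mechanical_angle(sin_v, cos_v, turns_per_rev, elec_error_deg=0.0):
    """Mechanical angle (deg) from resolver voltages.

    turns_per_rev: electrical revolutions per mechanical revolution.
    elec_error_deg: a hypothetical computational error added in the
    electrical domain (e.g. a wrong quadrant correction).
    """
    elec = math.degrees(math.atan2(sin_v, cos_v)) + elec_error_deg
    return elec / turns_per_rev
```

Feeding mechanical angles of 2, 5, 10 and 15 degrees through resolver_voltages(..., 4) and then mechanical_angle(..., 4, elec_error_deg=90) returns each angle plus 22.5 degrees, i.e. a constant offset regardless of the true angle. With turns_per_rev=1, calling mechanical_angle with the sin and cos arguments swapped returns 90° minus the true angle, matching the swapped-cable case.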

Still: If this sensor design is so bad, why has it stayed the same for decades? What changed on the MAX that altered the probability of this error so much?

Without an answer to this question I would not trust the AoA signals (and many other) at all! (...and I'm a functional safety consultant)

This SW fix tries to fix the impact of the failure, the root cause seems to be still unknown. But without knowing the cause, other side effects cannot be identified.
TryingToLearn is offline  
Old 1st Apr 2019, 00:11
  #2835 (permalink)  
 
Join Date: Mar 2005
Location: N/A
Posts: 5,882
Received 362 Likes on 192 Posts
I honestly think it's crazy that we are now in a situation where a plane is crashing and we are saying we need a software update... it's gone too far
This is not a first, software code has been responsible for prior accidents, Iberia A320 being one.
The design of the flight control system was such that the actions of both pilots over the flight controls were ignored by the logic of the control system and prevented the aircraft from flaring.

The cause of the accident was the activation of the angle of attack protection system which, under a particular combination of vertical gusts and windshear and the simultaneous actions of both crew members on the sidesticks, not considered in the design, prevented the aeroplane from pitching up and flaring during the landing.
Fixed by a code modification.

http://www.fomento.es/NR/rdonlyres/8...006_A_ENG1.pdf
megan is offline  
Old 1st Apr 2019, 00:59
  #2836 (permalink)  
 
Join Date: Mar 2015
Location: North by Northwest
Posts: 476
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by megan
This is not a first, software code has been responsible for prior accidents, Iberia A320 being one.
Fixed by a code modification.
Many years ago while working on a fire-control system, we were evaluating test methodologies between the F-16's Westinghouse, General Dynamics Phalanx fire-control, and Airbus fly-by-wire. The Airbus strategy (as I recall, which was about 4 decades back) was to deliver the code to 3 companies in three different countries, none of whom knew of the others' existence. AB expected each would find some unique code exceptions by doing so. Not so. Well over 90% were identified by multiple vendors, including all deemed critical bugs save maybe one. The rest were not considered major flight control errors.

Maybe Gums could chime in here, but we had heard rumors (maybe urban legend) that some of the early F-16 deployments in Germany with look-down did on occasion lock on to low flying Mercedes on the Autobahn. As a designer, how many would consider that possibility?

There will never be a perfect balance between automation and human interaction. Automation is programmed by humans - mistakes will happen on both ends.

Last edited by b1lanc; 1st Apr 2019 at 11:26.
b1lanc is offline  
Old 1st Apr 2019, 01:48
  #2837 (permalink)  
 
Join Date: Jul 2014
Location: Harbour Master Place
Posts: 662
Likes: 0
Received 0 Likes on 0 Posts
I'm not a software person; however, I have been interested in automation + human factors since a Computer Science friend put me on to the Therac-25 accidents in the early 1990s. This led to reading more by Nancy Leveson: High-Pressure Steam Engines and Computer Software. This is a great introduction to the larger picture: sophisticated hardware racing well ahead of much slower and riskier software engineering, set in historical context against the engineering of steam engines vs dangerous and lagging boiler tech. Public pressure forced the formation of safety laws to protect the end users from dangerously engineered devices. Although written in 1992, I believe it still has many insights that make it relevant. She has also written much on safe software development techniques.

Her homepage: Nancy Leveson Professor of Aeronautics and Astronautics
CurtainTwitcher is offline  
Old 1st Apr 2019, 05:32
  #2838 (permalink)  
 
Join Date: Jul 2007
Location: the City by the Bay
Posts: 547
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by Livesinafield
I think the publicity that the Max has generated over the last few weeks, especially since the grounding, could kill the airframe off... there may be no way back for it. The public are very powerful and could refuse to fly on it even after Boeing has "updated IOS" or whatever they are doing. I personally won't be going near one with my family even after it's "fixed", and I fly for a living, so Joe Public will be even more cautious.

I honestly think it's crazy that we are now in a situation where a plane is crashing and we are saying we need a software update... it's gone too far.

People accept that humans make errors and pilots sometimes screw up; what they won't accept is software in charge of their lives.
The Max is all in for Boeing. Fastest selling Boeing jet, yada yada. Boeing is going to do whatever it takes to get that bird back in the air. Too much at stake. The only way for the airframe to end production is if the majority of the orders evaporate overnight. This probably won't happen. So if all goes well, the software patch works, airlines don't cancel orders and the general public goes back to flying it, all will be well. But Boeing would be wise to get cracking on an all-new 737, based loosely on the 757 perhaps. Or better, basically copy the A320. And to amortize the costs of the Max quickly so the line can end as soon as the new one is ready. When will that be? Ten years? Now if the Max has another accident within the next few years, no matter the cause, that may be it. Boeing can't afford another Max going down.
armchairpilot94116 is offline  
Old 1st Apr 2019, 06:33
  #2839 (permalink)  
 
Join Date: Feb 2015
Location: The woods
Posts: 5
Likes: 0
Received 3 Likes on 2 Posts
Originally Posted by GlobalNav


The AoA display should be designed to support the pilot’s proper and effective use of it (whatever that is). It should be considered, not in isolation, but in the context of the rest of the flight display(s) and the instrument scan which the pilots are expected to conduct. Does any airline which has aircraft equipped with the AoA display have approved pilot procedures for its use? If a check ride was conducted, in which phases of flight would a pilot be faulted for failure to maintain awareness of the AoA display? If one were to evaluate the AoA display design, what measure of performance would be used? Considering the approved Boeing EFIS with the AoA in the upper right corner, how does that fit Basic T flight display philosophy? Of course it doesn’t, because AoA was never part of the Basic T. But if there was a logical, task performance-based purpose for the AoA display, why would it be placed above the altitude display, about as far from the airspeed and attitude indications as it could be? Yet, we suppose it enhances safety?
Hi Nav,
The purpose of the AoA indicator on the Max is not to read as an additional flight instrument.
It is a position indicator.
Therefore there is no requirement for it to be included in the scan etc.
If you get a disagree warning you can tell quickly which signal is the troublemaker.
To me it makes sense and yes, it is a safety feature.
If, together with the mod on MCAS travel and necessary information to converting crews, it had been incorporated from the beginning, then it would have given a valuable clue and could have prevented these tragic events.
I am still not a fan of the MCAS as a solution to the control force requirement. That doesn’t make me criticise every move Boeing makes, however, when they try to learn from their mistake.
bill fly is offline  
Old 1st Apr 2019, 07:01
  #2840 (permalink)  
 
Join Date: Aug 2000
Posts: 1,499
Likes: 0
Received 0 Likes on 0 Posts
This could very well be related to something else than just the AOA sensors.
What is different in the MAX AOA system compared to the NG? The sensors are the same.
I have never seen an AOA disagree caution on the NG, so why does it fail on the MAX?
The sensor may have been installed wrongly on the Lion Air MAX, but that was not the case on the Ethiopian MAX.
ManaAdaSystem is offline  

