
Ethiopian airliner down in Africa

Rumours & News Reporting Points that may affect our jobs or lives as professional pilots. Also, items that may be of interest to professional pilots.

Old 20th Apr 2019, 04:01
  #4161 (permalink)  
 
Join Date: Feb 2019
Location: shiny side up
Posts: 363
So far nobody asked Boeing how something so obvious and big could slip thru their safety process including document reviews, walk-thru, inspections, assessments and linked documents on several layers of detail.
Boeing must have known something was up if the computer modelling showed that MCAS needed to provide 0.6 degrees correction, and in flight testing, 2.5 degrees was required. That is a big disconnect between design assumptions and actual.

What went wrong within the engineering process and how can you prove that no other hazards escaped thru the exact same hole in your process?
There are 4 interfaces for the Horizontal Stabilizer portion of the Flight Control System (FCS), which link the autopilot trim, electric trim, manual trim, and MCAS trim to the stabilizer trim system.

EDIT: Just read this...
Boeing is currently examining whether or not the current MCAS interface between the MCAS computers and the horizontal stabilizer trim motors and Horizontal Stabilizer Jackscrew is compatible with the MCAS software updates.

As MCAS was an option for other 737 variants, (as well as other Boeing aircraft) it would be interesting to see what those systems provided as a correction. (perhaps to the 738?)

On a historical note, it appears that MCAS has been considered for most Boeing aircraft, but that the vortex tabs solved the problem....now we know why they are still on all of the wings.

Last edited by Smythe; 20th Apr 2019 at 04:30.
Smythe is offline  
Old 20th Apr 2019, 04:57
  #4162 (permalink)  
 
Join Date: Jul 2011
Location: Canada
Posts: 55
Originally Posted by MurphyWasRight View Post
My point was more about the nature of the emergency AD that could have been much clearer about uncommanded trim that stops with any pilot trim then restarts about 5 seconds later.
In that case stress importance of first fully trimming then hitting cutout.
This is hinted at in a note at the end of the procedure, I say it should have been highlighted.

Unlike "stuck switch" uncommanded trim, the MCAS case does allow for pilot electrical trim first, then followed by cutout; a quick blip of the switch is all that would be needed to test if this was possible. That could be a step in the runaway trim procedure. I agree we do not need a separate MCAS checklist if that was included.
I think that’s a good observation. Sorry for misinterpreting your original intent.
L39 Guy is offline  
Old 20th Apr 2019, 05:44
  #4163 (permalink)  
 
Join Date: Aug 2007
Location: Alabama
Age: 54
Posts: 343
Originally Posted by L39 Guy View Post
The Runaway Stabilizer checklist states "Condition: Uncommanded stabilizer trim movement occurs continuously". I think we can agree that an MCAS event is "uncommanded".
What does continuous mean? 5 seconds, 10 seconds, 1 minute, one hour? And let's say that the uncommanded stabilizer trim movement is caused by an intermittent short circuit somewhere (trim switch, wiring harness, etc) that produces a 5 second, 10 second, 1 minute uncommanded stabilizer trim movement? How can you tell the difference? Do you really care what the source is? Does the source affect the outcome? Do you really think that someone flying along, fat, dumb and happy and suddenly has the nose pitch down will have the presence of mind to count how long the trim is moving? I would suggest the shock value would preclude that.

As well, one does not want to get into the game of diagnosing the source of the failure (MCAS or otherwise) while the control of the aircraft is at stake. One checklist to cover all scenarios is more than adequate. Secure the malfunction, fly the airplane and land asap and save the diagnosis and troubleshooting once you're back on terra firma.
Except that MCAS can be stopped by a simple use of the thumb switch on the yoke. MCAS will not keep running away; it will just pause for 5 seconds and then restart.
MCAS also has a much different logic, including new cutoff logic compared with previous versions. A runaway trim can be in both directions, while MCAS is only nose down and can be stopped without cutout... to me, as a non-pilot, it does not look like a memory item.
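As a rough illustration of the difference being argued here, the sketch below contrasts a continuous "stuck switch" runaway with an uncommanded trim that stops on any pilot trim input and restarts after a pause. It is only a toy timeline built from the behaviour described in this thread (the 5 second pause comes from the posts above); the function name, the one-second time step and the parameters are assumptions for illustration, not Boeing's actual logic.

```python
def trim_timeline(duration_s, pilot_trim_at=(), cutout_at=None,
                  yields_to_pilot=True, pause_s=5):
    """Return one flag per second: True while uncommanded nose-down trim runs.

    Hypothetical model only. pilot_trim_at holds the seconds at which the
    pilot uses the yoke trim switch; cutout_at is the second at which the
    stab trim cutout switches are used (None means never).
    """
    flags = []
    resume_at = 0  # earliest second at which uncommanded trim may run again
    for t in range(duration_s):
        if cutout_at is not None and t >= cutout_at:
            flags.append(False)          # cutout stops the trim for good
        elif yields_to_pilot and t in pilot_trim_at:
            resume_at = t + 1 + pause_s  # pilot trim interrupts, then a pause
            flags.append(False)
        else:
            flags.append(t >= resume_at)
    return flags


def show(flags):
    """Render a timeline as text: '#' = trim running, '.' = trim stopped."""
    return "".join("#" if f else "." for f in flags)


mcas_like = trim_timeline(12, pilot_trim_at={3})
stuck_switch = trim_timeline(12, pilot_trim_at={3}, yields_to_pilot=False)
print(show(mcas_like))     # ###......###
print(show(stuck_switch))  # ############
```

In this toy model the intermittent case never looks "continuous" to someone watching the trim wheel, which is the point of disagreement over the wording of the checklist condition.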
FrequentSLF is offline  
Old 20th Apr 2019, 06:22
  #4164 (permalink)  
 
Join Date: Jan 2008
Location: australia
Posts: 101
TryingToLearn #4154 rightly asks “What went wrong within the engineering process and how can you prove that no other hazards escaped thru the exact same hole in your process.”
787 was first aircraft certificated (partly) under new “Organisation Designation Authorisation” (ODA) arrangements, specifically intended to reduce FAA involvement. NTSB Report 2014/AIR1401 tells us what went wrong and how hazardous batteries slipped thru.
“Boeing’s electrical power system safety assessment did not consider the most severe effects of a cell internal short circuit and include requirements to mitigate related risks, and the review of the assessment by Boeing authorized representatives and Federal Aviation Administration certification engineers did not reveal this deficiency.”
“Boeing failed to incorporate design requirements in the 787 main and auxiliary power unit battery specification control drawing to mitigate the most severe effects of a cell internal short circuit, and the Federal Aviation Administration failed to uncover this design vulnerability as part of its review and approval of Boeing’s electrical power system certification plan and proposed methods of compliance.”
“Unclear traceability among the individual special conditions, safety assessment assumptions and rationale, requirements, and proposed methods of compliance for the 787 main and auxiliary power unit battery likely contributed to the Federal Aviation Administration’s failure to identify the need for a thermal runaway certification test.”
787 battery fires could easily have cost two planes and all on board. Boeing/FAA Corp. failed to learn. 737 Max is second Boeing certificated under ODA.

Last edited by ozaub; 20th Apr 2019 at 06:28. Reason: formatting went awry
ozaub is offline  
Old 20th Apr 2019, 06:33
  #4165 (permalink)  
 
Join Date: Jan 2008
Location: Reading, UK
Posts: 10,277
Originally Posted by Smythe View Post
As MCAS was an option for other 737 variants
Can you expand on that? Have any (non-Max) 737 operators opted for MCAS on their aircraft? Why would they need it?

DaveReidUK is online now  
Old 20th Apr 2019, 11:21
  #4166 (permalink)  
 
Join Date: Sep 2017
Location: Europe
Posts: 1,248
Originally Posted by TryingToLearn View Post
OK, first: I'm not a pilot, I'm a functional safety engineer, mostly working for automotive.
Second: I read this thread from the beginning and learned a lot, thanks!

But I think I can explain one tendency which went up:
Pilots blame the pilots, engineers blame the Boeing engineers.

From my point of view, the reason in both cases is the same:
Pilots know the processes and trained procedures for pilots and learned, that the crews didn't follow them completely and textbook-like but rather improvised. But they do not know the engineering process regarding safety-critical systems/hardware/software.
With the engineers it is exactly the other way round. They see a crew overwhelmed by alarms, shakers and information caused by an engineering error. For them (sorry) the pilot is the last line of defense in case they did not do their job or everything goes wrong (multiple point fault).

Pilots follow procedures which e.g. minimize the risk to take off with a wrong configuration. They double-check and check again and have proven-in-use procedures which make sure that such things happen less than one in a million flights.
Engineers know proven-in-use processes which make sure that something like the current MCAS system effectively never happens.
Still it happened.

Boeing knows why they put all focus on how great they fix MCAS because if someone asks the right question, they are in much deeper trouble like, for example Volkswagen:
The big punishment for them was not fixing the cars but they had to implement a process that makes sure that this never happens again.

So far nobody asked Boeing how something so obvious and big could slip thru their safety process including document reviews, walk-thru, inspections, assessments and linked documents on several layers of detail. And, in addition, how this would not be found in all the classic safety/quality analysis methods (FMEDA, FMEA, FTA, DFA...).
Safety is not based on the genius of the one great programmer who is also a pilot and simulates everything in his head (but makes a mistake after having too much pizza) but rather a strict process including a lot of people and a lot of documentation and testing.

Within this thread, pilots question the training and qualification of mainly all pilots regarding critical situations. But they are the last line of defense.
Following the same logic, one could question the qualification, independence and culture of Boeing safety engineers.
And yes, that would lead to the question if there are other functions like MCAS still hidden...

Maybe the pilots may have been able to save a few lives, but the biggest mistakes happened years before driven by
-> Strange laws (Grandfather rights)
-> Commercial interest (no training)
-> inconsistent requirements / documentation (0.6 within risk analysis and 2.5 within SW)
-> Maybe bad safety culture if this was done on purpose and not by mistake
-> Mistakes within the impact analysis of a wrong MCAS activation

If I were a member of the FAA or similar organization, I would not focus on MCAS and the bugfix, I would simply ask: What went wrong within the engineering process and how can you prove that no other hazards escaped thru the exact same hole in your process?
The deviation from established engineering processes that I assume here far exceeds, in my opinion, the deviation between the trim runaway procedure and what actually happened.

But as mentioned: I'm not a pilot.
Normalisation of Deviance.

It happens an increment at a time.
Rated De is offline  
Old 20th Apr 2019, 12:01
  #4167 (permalink)  
 
Join Date: May 2008
Location: denmark
Posts: 26
"Maybe the pilots may have been able to save a few lives, but the biggest mistakes happened years before driven by
-> Strange laws (Grandfather rights)"

Within the machine industry there are no 'grandfather rights', neither in the European Machinery Directive nor in US NFPA.
https://www.robotics.org/content-det...ontent_id/6622
HighWind is offline  
Old 20th Apr 2019, 12:40
  #4168 (permalink)  
 
Join Date: Apr 2019
Location: UK
Posts: 1
Been following most of this thread with interest since it started, but apologise if this has already been posted. I can't post the URL as I haven't posted 10 times, but it's a page from Spectrum.ieee org by a software developer and pilot, and it looks at the 737 Max disaster from a software developer's point of view.


This bit in particular sums it all up for me, whereby the author compares fitting an autopilot to his Cessna 172 with the certification of MCAS on the Max 8.

"As you can see, the similarities between my US $20,000 autopilot and the multimillion-dollar autopilot in every 737 are direct, tangible, and relevant. What, then, are the differences?
For starters, the installation of my autopilot required paperwork in the form of a “Supplemental Type Certificate,” or STC. It means that the autopilot manufacturer and the FAA both agreed that my 1979 Cessna 172 with its (Garmin) autopilot was so significantly different from what the airplane was when it rolled off the assembly line that it was no longer the same Cessna 172. It was a different aircraft altogether.
In addition to now carrying a new (supplemental) aircraft-type certificate (and certification), my 172 required a very large amount of new paperwork to be carried in the plane, in the form of revisions and addenda to the aircraft operating manual. As you can guess, most of those addenda revolved around the autopilot system.
Of particular note in that documentation, which must be studied and understood by anyone who flies the plane, are various explanations of the autopilot system, including its command of the trim control system and its envelope protections.
There are instructions on how to detect when the system malfunctions and how to disable the system, immediately. Disabling the system means pulling the autopilot circuit breaker; instructions on how to do that are strewn throughout the documentation, repeatedly. Every pilot who flies my plane becomes intimately aware that it is not the same as any other 172.

This is a big difference between what pilots who want to fly my plane are told and what pilots stepping into a 737 Max are (or were) told.

Another difference is between the autopilots in my system and that in the 737 Max. All of the CAN bus–interconnected components constantly do the kind of instrument cross-check that human pilots do and that, apparently, the MCAS system in the 737 Max does not. For example, the autopilot itself has a self-contained attitude platform that checks the attitude information coming from the G5 flight computers. If there is a disagreement, the system simply goes off-line and alerts the pilot that she is now flying manually. It doesn’t point the airplane’s nose at the ground, thinking it’s about to stall.

Perhaps the biggest difference is in the amount of physical force it takes for the pilot to override the computers in the two planes. In my 172, there are still cables linking the controls to the flying surfaces. The computer has to press on the same things that I have to press on—and its strength is nowhere near as great as mine. So even if, say, the computer thought that my plane was about to stall when it wasn’t, I can easily overcome the computer.

In my Cessna, humans still win a battle of the wills every time. That used to be a design philosophy of every Boeing aircraft, as well, and one they used against their archrival Airbus, which had a different philosophy. But it seems that with the 737 Max, Boeing has changed philosophies about human/machine interaction as quietly as they’ve changed their aircraft operating manuals."
Nigel Tufnel is offline  
Old 20th Apr 2019, 14:07
  #4169 (permalink)  
 
Join Date: Feb 2019
Location: shiny side up
Posts: 363
Quote: As MCAS was an option for other 737 variants
Can you expand on that? Have any (non-Max) 737 operators opted for MCAS on their aircraft? Why would they need it?
In reading through the volumes of data on AoA... I did read where it was either offered or going to be offered on the entire 737 line. I don't know if and when it was offered. The issue of the vortex tabs was included in this explanation. It may be when they were trying to get rid of the tail tabs, but it was unclear, when it stated vortex tabs, which ones they were talking about.
Perhaps they were just going to offer MCAS for stall protection, I don't know. Now that the internet is inundated with information on this, I will have to dig to find this.

MCAS is standard on the 767 tanker, but is different... this seems to be a much better system: 2 sensors, and it disconnects on pilot input.
The KC-46 uses a similar system because the weight and balance of the tanker shifts as it redistributes and offloads fuel. The KC-46 has a two-sensor MCAS system, which “compares the two readings,” the Air Force said.
Moreover, while the MAX 8 MCAS will reset and come back on automatically, the KC-46’s system is “disengaged if the pilot makes a stick input,” according to the Air Force. “The KC-46 has protections that ensure pilot manual inputs have override priority.”
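To make the two design points in that Air Force description concrete (compare two sensor readings, and let pilot input take priority), here is a minimal sketch of such a gating check. Everything in it is a labelled assumption: the function name, the trigger angle and the disagreement threshold are made-up illustrative values, and this is not the KC-46 or 737 MAX implementation.

```python
def augmentation_may_run(aoa_left_deg, aoa_right_deg, pilot_stick_input,
                         trigger_deg=12.0, max_disagree_deg=5.0):
    """Decide whether an automatic nose-down trim function may act.

    Hypothetical logic and thresholds, for illustration only.
    Returns a tuple (allowed, reason).
    """
    # Pilot manual input has override priority over the automatic function.
    if pilot_stick_input:
        return False, "pilot input has priority"
    # Compare the two readings; a large split means one sensor is suspect.
    if abs(aoa_left_deg - aoa_right_deg) > max_disagree_deg:
        return False, "AoA sensors disagree"
    # Act only when both sensors agree that the angle of attack is high.
    if min(aoa_left_deg, aoa_right_deg) >= trigger_deg:
        return True, "both sensors report high AoA"
    return False, "AoA below trigger"


print(augmentation_may_run(14.0, 15.0, pilot_stick_input=False))
print(augmentation_may_run(22.0, 4.0, pilot_stick_input=False))
print(augmentation_may_run(14.0, 15.0, pilot_stick_input=True))
```

With a single sensor the second case (one vane reading far too high) would look like a genuine high angle of attack, which is the failure the comparison is meant to catch.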
Smythe is offline  
Old 20th Apr 2019, 14:20
  #4170 (permalink)  
 
Join Date: Dec 2018
Location: 8th floor
Posts: 82
Originally Posted by TryingToLearn View Post
OK, first: I'm not a pilot, I'm a functional safety engineer, mostly working for automotive.
Second: I read this thread from the beginning and learned a lot, thanks!

But I think I can explain one tendency which went up:
Pilots blame the pilots, engineers blame the Boeing engineers.
Good point. I also read this thread from the beginning and, to add another data point to the tendency you noticed, I'm a software engineer and I tend to blame Boeing more than the pilots. Boeing's attitude after the Lion Air accident contributed to that. If they didn't try to downplay the gravity of experiencing an incorrect MCAS activation, I would have probably been more sympathetic towards them.

Just like there seems to be a deficit of pilots in the aviation world, I think that generally there is a deficit of good software developers, and it's getting worse. I think the quality of software took a nosedive during the last decade. Software from a decade ago was way more polished than what I see today, and this is very frustrating.

Sure, just as the safety of air travel is getting better and better, a lot of lessons have been learned from the past in the software industry, and some types of bugs and quality issues are becoming less and less frequent. But there seems to be a lot less attention to detail, and I find it unbelievable that large software companies repeatedly release software products with significant bugs that should be obvious to anyone after only a few minutes of using the product.

And I don't think it's just the deficit of good software developers causing this. I see a variety of other factors contributing to the decline in the quality of software, for example a tendency to spend less on quality assurance, and relying more and more on the end users to find and report quality issues with the software products. I hoped this trend would affect mostly regular consumer products, and not software that is critical for safety, but unfortunately that doesn't seem to be the case, some recent examples being the Tesla self driving car software, and possibly MCAS.

Anyway, back to the MCAS topic, I watched Mentour's recent video about being unable to trim manually at high speeds when the aircraft is severely out of trim. One thing that surprised me is that the simulator, which Mentour described as: "this is a level D FFS. That’s as real as it gets", is not able to replicate a stabilizer runaway similar to an incorrect MCAS activation, the runaway stabilizer failure is not able to bring the trim to full AND.

I guess the reason the simulated failure is not able to apply more AND trim is that it simulates something similar to stuck yoke trim switches. In such a situation, after reaching about 4 units with the flaps retracted, the trim limit switches would activate. That would prevent the trim from going lower than 4 units. I think that's why they have to trim manually the wrong way in the video, to try to simulate a worse mistrim, similar to that experienced by the Lion Air and Ethiopian pilots, because the simulator doesn't seem to be able to do that.

If that's the case, I'm even more annoyed by Boeing's initial response that the pilots should have just applied the runaway stabilizer procedure. If the simulators are not able to replicate a mistrim as severe as one caused by a malfunctioning MCAS, clearly the existing simulator training for a stabilizer runaway failure is not entirely adequate for dealing with an MCAS induced trim runaway.
MemberBerry is online now  
Old 20th Apr 2019, 16:25
  #4171 (permalink)  
 
Join Date: Apr 2019
Location: East Coast
Posts: 3
My Appreciation to You for Speaking

Originally Posted by L39 Guy View Post
I think your analysis and continuum of blaming the pilots/Boeing and everyone else harmless to blame Boeing and everyone else/pilots harmless is a nice summary.

I will tell you where I sit on this, based upon 36 years of professional flying (31 airline, 5 military), type rated on B737-200/767/777/787 and Airbus A330/A340, and a Professional Engineer.

First, aircraft are amazing and complex machines. Aircraft do what man was never intended to do naturally so there is an inherent risk in that alone. But even the best designed and best maintained aircraft have components that break and the human onboard is the last line of defense in many of those instances. Hydraulic systems, electrical systems, pressurization systems, propulsion systems (engines) all fail and often there is not double or triple redundancy due to weight issues, cost issues, probability of failure and the impact of a failure. As an example, there is no level of redundancy that will offset an engine failure no matter how many engines the aircraft has; the adverse yaw, the reduced performance, etc cannot be compensated for by having more engines, although the more engines an aircraft has, the less effect a single engine failure has. Aircraft manufacturers, at the urging of their airline customers, like two engine aircraft for the economics (one engine versus two, reduced fuel, reduced weight, etc). So we accept the fact that an engine failure on a two engine aircraft will have a big effect, more than a single engine failure on an eight engine aircraft.

So when an engine (or hydraulic, or electric, or pressurization) system fails a pilot is in the loop to manage the situation. That is why we are highly trained and, hopefully, well compensated financially for that knowledge and judgement. The level of training lies with the individual knowing their stuff including their emergencies, particularly emergencies that are memory items - that is a personal responsibility of any professional pilot. The airline and country CAA are responsible too for ensuring that the pilots charged with responsibility for that aircraft and those lives in the aircraft are also trained properly initially and on a recurrent basis.

Where do I sit on your continuum? Somewhere between 2 and 3. MCAS is a required stall protection however it needs to be toned down a bit as it can move the stabilizer trim to very large aircraft nose down angles. Should pilots be aware of MCAS in their technical training on the MAX; sure, but it would not have affected the outcome regardless as, I will describe shortly, MCAS failure presents characteristics identical to a stabilizer trim runaway, an emergency checklist item that has been around since the original B737 fifty years ago.

I would expect that any pilot with a type rating for a Boeing 737, MAX included, should be able to identify an Unreliable Airspeed (UAS). That is basic stuff, stick shaker when the aircraft is flying normally, disparity between the indicated airspeeds, etc. Any professional pilot should be able to recognize this regardless of what aircraft they are flying as every aircraft in the world is subject to this problem.

Only 2 of the 3 non-US MCAS incidents saw the pilots recognize the UAS and do something about it. This was long before MCAS reared its ugly head so that begs the question: Why? Training and experience would be my answer. And, as it turned out, the crew that did execute the UAS drill were the ones that ultimately saved the aircraft. Bear in mind that UAS is a memory drill. To me, there is no excuse why this was not done as it was a textbook UAS; to me also, in the Ethiopian case, engaging the autopilot at 400 ft is a definite faux pas as it is contrary to the memory UAS drill and, if the aircraft was indeed stalled, another definite faux pas as one does not recover from a stall with the autopilot. This points to a training/experience issue too but it also points to an over reliance on the use of the autopilot at the expense of hand flying an airplane. This too is likely an airline/CAA issue as well as an individual pilot issue. And, not to imply that this is a "third world" country issue as I am seeing this more and more with the FO's I fly with; they are terrified to hand fly the aircraft and hence their hand flying skills begin to decline.

When all of these crews experienced MCAS, 0 of 3 were able to recognize a classic stab trim runaway; while manually flying the aircraft, the nose pitches down all by itself. You simply can't miss it, with or without seeing the stab trim wheel moving. In the Lion Air case where the aircraft was saved, it took a third pilot from a different airline to tell the crew what to do; in the case of the fatal Lion Air accident, the Captain handed control to the First Officer so he could go hunting through the checklist for something to do - this is a memory drill! I fault the individual pilots for not knowing their memory emergencies - harsh as that may sound, that is what we are paid for. (Personally, I do a study of the memory emergencies regularly as the memory isn't what it used to be and I like to think that most professional pilots do the same).

I am not going to rehash the rest of the issues (flying to destination with this problem, not controlling the aircraft speed in the Ethiopian case, etc) as that would be covering old ground but there are lots of basic airmanship issues that are highly questionable.

In conclusion, all of these MCAS issues were recoverable situations by a trained and competent crew. I do not blame the individuals entirely as the airline, their training, their hand flying policies as well as the CAA overseeing them deserve scrutiny too however. Let me repeat that: These MCAS accidents were all recoverable. Boeing's mistake, in addition to what I noted earlier, is assuming that B737 type rated pilots would be able to do even the most basic, memory emergency drills. This, however, begs the question: If pilots cannot even do a simple UAS emergency, what hope do they have for a more complex one such as an engine failure at rotation or an engine failure followed by a cabin depressurization which that Southwest crew handled masterfully?

What this MCAS situation points to, in my estimation, is that the aviation industry has been "whistling past the graveyard" for a little too long and that the underlying problems quickly manifest themselves with even the slightest irregularity - the Turkish B737 accident at Schiphol, the Korean 777 in SFO, etc. A complete rethink of pilot training, basic flying skills and airmanship is in order as there is a finite limit to what Boeing, Airbus, Embraer or any other aircraft manufacturer can do to design and build airplanes without human intervention being required at certain times by competent aviators.
Hello all:
Today I registered for this board although I probably will never post here again. I have lurked for years as I find the topics here of great interest. First, I am not a pilot but a doctorate level engineer who has spent close to 40 years in aerospace design and engineering in the United States.
I am posting today to express my deep appreciation to L39 for his insightful, articulate, and erudite post. In my profession and leisure life I am a (very) frequent flyer. Many of the locations my wife and I travel require one or more 737 flights. I have been following the terrible Max accidents from the initial Lion Air loss, as many of my business and leisure legs have been assigned to 737 Max aircraft. After the second loss I was prepared to find other means rather than board a Max flight, but of course the grounding eliminated the need. Reading every Max related post on this board I must say that I was not encouraged by the tone of many, but not all, of the posts. There seemed to be a widespread feeling expressed among many of the professional pilots that seemed to me as rejection of any thought that the unfortunate crews had any blame in the losses, and that Boeing was to blame completely. The reason that disturbed me so much was that we in the traveling public depend totally on the skill and knowledge of the flight crew. We are totally in your hands. From my background I do understand technically what happened in these accidents and to me, from a technical perspective, it seemed that they should have been recoverable. Despite the challenging human factors environment.
Then finally after 3 months, L39 comes along and tells me exactly what I was looking for in all these posts. If I may repeat your words "....These MCAS accidents were all recoverable......" When I board a flight I need to feel confident that the crew in the cockpit can fly me (and themselves) to our destination safely. After L39's post my confidence is now coming back. I just hope that the confidence and preparation L39 exemplifies is widespread beyond just the people who post here.
Thank you L39 once again.
EPHD75 is offline  
Old 20th Apr 2019, 16:57
  #4172 (permalink)  
 
Join Date: Feb 2000
Location: Pacific
Posts: 725
The real problem is not the MCAS system. It is the lack of experience and ability of the pilots to handle a simple mechanical failure. That is their job; the last defense against disaster. There were many ways the pilots on those airplanes could have safely landed their aircraft but they failed. As we replace experienced pilots with newbies who only see the career of an airline pilot as a job that is easier to get than that of a doctor or engineer and pays more with more time off, we will see the same errors and failures here too. I forecast that our safest days in aviation are behind us and we will see a strong uptick in pilot error accidents. We should be using this MAX8 affair as a wakeup call to improve the skills and experience of our flight crews. If we cannot train pilots and give them a chance to gain experience before they start sitting in the left seat they will be what we are producing now: Button Pushers.
boofhead is offline  
Old 20th Apr 2019, 17:04
  #4173 (permalink)  
 
Join Date: Nov 2018
Location: Vancouver
Posts: 52
Originally Posted by L39 Guy View Post
I think your analysis and continuum of blaming the pilots/Boeing and everyone else harmless to blame Boeing and everyone else/pilots harmless is a nice summary...snipped...

Only 2 of the 3 non-US MCAS incidents saw the pilots recognize the UAS and do something about it.
I'm confused: have there been any US MCAS incidents? Can you at least cite me one case?

Originally Posted by L39 Guy View Post
...And, as it turned out, the crew that did execute the UAS drill were the ones that ultimately saved the aircraft.
I will just concentrate on Lion Air Flight JT610.
How do you know the CAPT on that flight didn't perform the UAS NNC? Did you get hold of the CVR transcript? We know there wasn't any CVR transcript in the Lion Air JT610 preliminary crash investigation because they still hadn't recovered JT610's CVR at that juncture. The CVR was buried beneath the thick mud of the Java Sea for almost 60 days before they finally recovered it. And no official transcript has been released thus far.

Furthermore, the crew had been briefed about the malfunctions and the fixes by the MX on the ground, plus there was a log left by the previous crews [Flight JT043] written as:

"Airspeed unreliable and ALT disagree shown after takeoff, STS also running to the wrong direction, suspected because of speed difference, identified that CAPT instrument was unreliable and handover control to FO. Continue NNC of Airspeed Unreliable and ALT disagree. Decide to continue flying to CGK at FL280, landed safely runway 25L."


Wouldn't you think the first thing flashed on the CAPT's mind should there be trouble with the aircraft is to recall the previous flight's log entry which explained, among other things, the crews' execution of the Airspeed Unreliable NNC and also ALT disagree memory items?
patplan is offline  
Old 20th Apr 2019, 20:55
  #4174 (permalink)  
 
Join Date: Mar 2019
Location: Bavaria
Posts: 17
Originally Posted by MemberBerry View Post
Just like there seems to be a deficit of pilots in the aviation world, I think that generally there is a deficit of good software developers, and it's getting worse. I think the quality of software took a nosedive during the last decade. Software from a decade ago was way more polished than what I see today, and this is very frustrating.
Please don't make the mistake of putting your best programmers on functional safety coding. Trust me, they will quit!

That's the whole point about safety: even if someone brain-dead, a Greenpeace airplane-hating terrorist or an ape wrote the code, you would find out before the first passenger boards the plane. There is a complete description of the functionality within several layers of requirements, every requirement has its own validation criterion and test cases, and in the end you have 100% test coverage at software module, software system, system, item (flight control) and vehicle/airplane/machine/... level.
As sad as it seems, the MCAS software did work exactly as specified. No programmer to blame.

At the time of the initial safety analysis someone defined 0.6° as the max. impact this system is allowed to have.
At this time someone wrote or should have written a requirement specifying that MCAS must never turn the trim by more than this (together with a minimum time in case of repeated action).
Every final software, every MCAS SW-module, every configuration should have been checked against this requirement and validation criterion by appropriate automated and documented test cases.
That's how safety works!
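
To make the idea concrete, here is a minimal sketch of what such a requirement plus its automated check could look like. This is purely illustrative and not Boeing's code: the 0.6° limit is the figure from the discussion, while the 10 second repeat interval and all names (TrimLimiter, check_requirement) are my own assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

MAX_TRIM_DEG = 0.6     # limit quoted in the discussion
MIN_REPEAT_S = 10.0    # hypothetical minimum time between activations


@dataclass
class TrimLimiter:
    """Hypothetical limiter: clamps each trim command and blocks early repeats."""
    max_deg: float = MAX_TRIM_DEG
    min_repeat_s: float = MIN_REPEAT_S
    last_activation_s: Optional[float] = None

    def command(self, requested_deg: float, now_s: float) -> float:
        """Return the trim change actually permitted for this request."""
        if (self.last_activation_s is not None
                and now_s - self.last_activation_s < self.min_repeat_s):
            return 0.0  # too soon after the previous activation
        allowed = max(-self.max_deg, min(self.max_deg, requested_deg))
        if allowed != 0.0:
            self.last_activation_s = now_s
        return allowed


def check_requirement(make_limiter: Callable[[], TrimLimiter],
                      requests: Iterable[Tuple[float, float]]) -> bool:
    """Validation criterion: never exceed the limit, never repeat too early.

    requests is a sequence of (time_s, requested_deg) pairs.
    """
    limiter = make_limiter()
    last_t = None
    for t, requested in requests:
        out = limiter.command(requested, t)
        assert abs(out) <= MAX_TRIM_DEG, f"limit exceeded at t={t}"
        if out != 0.0:
            assert last_t is None or t - last_t >= MIN_REPEAT_S, \
                f"repeat too early at t={t}"
            last_t = t
    return True
```

The point is that check_requirement does not care how the limiter is implemented. Any later software build or configuration that commands 2.5° instead of 0.6° would fail the same test without anyone having to remember the original analysis.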

Now how did it happen? I don't know:
a) They knew but management forced them -> safety culture problem, on purpose...
b) They never wrote the tests -> process problem, nobody should release something for series without finishing all test runs
c) They ignored the test results -> safety culture problem, process problem
d) They changed the requirement after changing the SW but did not touch the safety analysis -> traceability problem, that's what ALM software systems are made for. Of course you have a problem if 99% of your requirements are blueprints from 1968... (-> grandfather rights, which should not apply to anything which is not 100% proven in use within exactly the old configuration)
e) The trim motor was supposed to turn slowly (10 sec -> 0.6°), instead it turned 4 times faster -> Item level testing, hardware in the loop testing?
f) They never wrote down their analysis assumptions as requirements -> fire the system requirement engineers, not the programmer. Ask yourself what your reviews are worth, are your reviewers just interested in the cookies/donuts?
etc.

Safety programming is brain-dead translation of UML into code. There is absolutely no room for creativity or interpretation. The main job within safety is sitting (+thinking, not just physical presence) in reviews and questioning every single line somebody wrote as a requirement (on functional, system, SW system or SW module level) several times. Never rely on the genius of a requirement author, programmer or test engineer. Everything goes through several reviews, assessments etc.

Second problem, which worries me the most, is the use of just one input. There are 2 sensors, use them! Relying on only one probe with very low diagnostic coverage is just bad. Safety-critical systems should be single-point-fault tolerant. But this is also a technical system requirement. Such a decision is made 6 months before coding. Nobody questioned this?
If this was, as claimed earlier within the thread, a commercial decision to avoid training on a simple diagnostic message, then the safety culture went down the drain, flushed by commercial interest (since it was also claimed that the safety level was estimated high enough to require redundancy). Such a finding would put a question mark on every difference between NG and MAX.
The very sad thing is that this AoA sensor comparison seems to be implemented and working, but sold as an extra. Maybe Boeing just wanted to earn extra money, but on the other hand there is this strange coincidence that this feature compromises the sales argument ('no training') on one hand and does rescue airplanes on the other ('If you buy the sensor comparison for peanuts, it's your fault if you need to train your pilots'). -> Before talking about 'better' US/European pilots, better check whether they simply had the option installed and were, surprisingly, not overwhelmed by a simple message instead of a spinning trim wheel.

Safety engineers are not very popular within companies because this process takes time and often delays a development a lot if done as required. There is no space for agile programming, Scrum etc. It's a V-Model lifecycle at its best.
Guess what Boeing didn't have during the MAX development?

Oh, and one fun fact about European rail safety: Signals are a fail-safe system. In case something goes wrong, all signals are red, every train stops. Then there are operators who can manually override signals after making sure (by phone) that only one train is on one rail at a time. Fine...
Still, the automatic system and collision avoidance have availability requirements, they are required to work most of the time. The reason is simple: humans are error-prone, and if the signals were operated manually for more than a very limited time, the average error probability would be too high.
An autopilot is far more reliable than a pilot, pilots would make more mistakes (that's why there are two). Still the pilot has to be trained for every situation and for manual flying. Maybe (non-pilot assumption) there is just not enough costly simulator training? Why would you (risk to) do the training with passengers on board?
Oh, I forgot, simulator time is expensive...
TryingToLearn is offline  
Old 20th Apr 2019, 21:24
  #4175 (permalink)  
 
Join Date: Apr 2019
Location: USA
Posts: 217
Originally Posted by TryingToLearn View Post
Second problem, which worries me the most, is the use of just one input. There are 2 sensors, use them! Relying on only one probe with very low diagnostic coverage is just bad. Safety-critical systems should be single-point-fault tolerant. But this is also a technical system requirement. Such a decision is made 6 months before coding. Nobody questioned this?
I've pondered this question myself quite a bit. I'm not sure we will ever know the correct answer, but let me offer an observation. While the 737 has a lot of redundancy, that redundancy does not generally extend to two sensors coming to an agreement before one of them causes a system response.

The most obvious example is that if one stall computer (SMYD) senses an approach to stall condition, it will turn on one stick shaker and activate the Elevator Feel Shift Module (EFSM). I believe one SMYD can also activate the Speed Trim Stall ID function and the autoslats (I'm actually trying to confirm these last two. The aircraft maintenance manual (AMM) suggests this is the case, but I haven't found anyone at my company who can say for sure). If the left and right inputs disagree, you will get some kind of message, but the system response still occurs.

I could envision a scenario in which someone on the MCAS design team looked at how previous 737 models treated these system inputs and simply followed suit. The difference this time was the system response was more than an annoyance - it was, sadly in hindsight, an existential threat.

Last edited by 737 Driver; 20th Apr 2019 at 21:26. Reason: clarity
737 Driver is offline  
Old 20th Apr 2019, 21:48
  #4176 (permalink)  
 
Join Date: Mar 2019
Location: Bavaria
Posts: 17
Originally Posted by 737 Driver View Post
The difference this time was the system response was more than an annoyance - it was, sadly in hindsight, an existential threat.
In case of an important 'warning' function you may want something fail-operational, your safe state is 'warn'. So you place an 'OR' logic within your redundancy.
In case of a -maybe dangerous- reaction and a fail-safe ('better do nothing') system, AND is the only solution.

Next question, where is MCAS?
a) Do you need fail-operational performance? Is this 'feel' in case of being close to stall very important? ->OR
b) Do you need to be fail-safe? Is a wrong activation critical? -> AND
c) Both? -> 3 Sensors, 2oo3 reaction, 3oo3 maintenance message
d) nice feature, doesn't do any harm? -> single sensor
There's no rocket science behind such systems.
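
As a rough illustration of the options above (a sketch only, with function names of my own choosing, and my reading of '3oo3 maintenance message' as 'flag it whenever the three sensors do not all agree'):

```python
from typing import Tuple


def vote_or(a: bool, b: bool) -> bool:
    """Fail-operational: either channel alone triggers (case a, warnings)."""
    return a or b


def vote_and(a: bool, b: bool) -> bool:
    """Fail-safe: both channels must agree before anything happens (case b)."""
    return a and b


def vote_2oo3(a: bool, b: bool, c: bool) -> Tuple[bool, bool]:
    """Case c: react on a 2-out-of-3 majority.

    Returns (react, maintenance_message). The maintenance message is raised
    whenever the three channels do not all agree (assumed interpretation).
    """
    votes = int(a) + int(b) + int(c)
    react = votes >= 2
    maintenance_message = votes not in (0, 3)
    return react, maintenance_message
```

With one faulty sensor shouting 'stall', vote_or reacts, vote_and stays quiet, and vote_2oo3 stays quiet but asks for maintenance.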

But even if this system relied on a single sensor, range checks are also a valid method.
Close to stall at 75°AoA? Oh, we may need to adjust the feel a bit?!?
Even a 'no-brain' range plausibility check on the 1 sensor would have rescued one of 2 planes (fun fact: Such a range check is considered a 'low-coverage' method in automotive, estimated to catch 60% of all errors (ISO 26262 part 5 annex D sensors...)).
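
Such a check is only a few lines. A minimal sketch, where the plausible range and the stall threshold are made-up numbers for illustration (I don't know the real values) and the function names are mine:

```python
AOA_MIN_DEG = -20.0          # hypothetical lower plausibility bound
AOA_MAX_DEG = 40.0           # hypothetical upper plausibility bound
STALL_THRESHOLD_DEG = 14.0   # hypothetical activation threshold


def aoa_is_plausible(aoa_deg: float) -> bool:
    """Range check: reject readings no flying airplane could produce."""
    return AOA_MIN_DEG <= aoa_deg <= AOA_MAX_DEG


def stall_protection_active(aoa_deg: float) -> bool:
    """Only react to a high AoA if the reading itself is plausible."""
    if not aoa_is_plausible(aoa_deg):
        return False  # treat the sensor as failed, do nothing
    return aoa_deg >= STALL_THRESHOLD_DEG
```

A reading of 75° is rejected outright. A sensor that is wrong but still inside the plausible range would pass, which is why this only counts as a low-coverage method.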
TryingToLearn is offline  
Old 20th Apr 2019, 21:53
  #4177 (permalink)  
 
Join Date: Nov 2018
Location: Planet Earth
Posts: 7
Originally Posted by 737 Driver View Post
I've pondered this question myself quite a bit. I'm not sure we will ever know the correct answer, but let me offer an observation. While the 737 has a lot of redundancy, that redundancy does not generally extend to two sensors coming to an agreement before one of them causes a system response.

The most obvious example is that if one stall computer (SMYD) senses an approach to stall condition, it will turn on one stick shaker and activate the Elevator Feel Shift Module (EFSM). I believe one SMYD can also activate the Speed Trim Stall ID function and the autoslats (I'm actually trying to confirm these last two. The aircraft maintenance manual (AMM) suggests this is the case, but I haven't found anyone at my company who can say for sure). If the left and right inputs disagree, you will get some kind of message, but the system response still occurs.

I could envision a scenario in which someone on the MCAS design team looked at how previous 737 models treated these system inputs and simply followed suit. The difference this time was the system response was more than an annoyance - it was, sadly in hindsight, an existential threat.
I remember someone explaining that if they went with two sensors, the system would have to notify on disagreement and generate a warning, and that would need extra training and/or somehow impact the constraints under which the MAX was being designed.

I don't remember who said that or how accurate it is. If true, it raises the question: if the modified MCAS now accepts two inputs and self-disables on disagreement, is the no-training card already lost? Along with penalties of millions owed to certain airlines?
robocoder is offline  
Old 20th Apr 2019, 22:17
  #4178 (permalink)  
 
Join Date: Aug 2002
Location: Switzerland, Singapore
Posts: 1,305
I gladly repeat my earlier post, where I wrote that it's not MCAS which was the main problem but the data gathering unit (I'm not a B guy) between the AOA probe and the flight control computer. People insist that it was a probe failure, but this is highly improbable: two probe failures on a new plane within months, on top of all the MCAS incidents on US airliners. I have never heard of an AOA probe failure, it happens very rarely.
It must have been the AOA data that was corrupted, not the AOA probe itself.
Only then did MCAS mix things up with the data (rubbish in, rubbish out). MCAS reacted as programmed, it received the wrong data.
Dani is offline  
Old 20th Apr 2019, 22:23
  #4179 (permalink)  
 
Join Date: Apr 2019
Location: USA
Posts: 217
Originally Posted by TryingToLearn View Post
In case of an important 'warning' function you may want something fail-operational, your safe state is 'warn'. So you place an 'OR' logic within your redundancy. In case of a -maybe dangerous- reaction and a fail-safe ('better do nothing') system, AND is the only solution.

Next question, where is MCAS?
Again, just speculating. According to initial reports, MCAS was only supposed to make a 0.6 degree nose down input. That is entirely manageable and wouldn't pose a threat, particularly if it occurred only once. Multiple 2.5 degree nose down inputs are an altogether different story. We don't know why this change to MCAS authority was made, but apparently someone didn't connect the dots.
737 Driver is offline  
Old 20th Apr 2019, 22:54
  #4180 (permalink)  
 
Join Date: Apr 2019
Location: USA
Posts: 217
Originally Posted by Dani View Post
I repeat my post again gladly, where I wrote that it's not the MCAS which was the main problem but the data gathering unit (I'm not a B guy) from the AOA probe to the flight control computer.
I would have to look at the system architecture to be sure, but I'm almost positive that the AOA output does NOT go direct to the FCC. AOA is an input to both the Stall Management Yaw Damper (SMYD) computers and the Air Data Inertial Reference Units (ADIRU's). AOA is just one input into the SMYD's and ADIRU's. A bad pitot tube input could also have generated a false stall signal.

The primary responsibility of generating an approach to stall signal belongs to the SMYD which then activates various other aircraft systems (including the FCC) in response to the impending stall. I believe MCAS is a subroutine within the FCC. As I posted earlier, in many respects the two SMYD's act independently requiring only one "vote" to activate various systems, though there will be some other alerts to indicate the disagreement.


People insist that it was a probe failure, but this is highly improbable: two probe failures on a new plane within months
Not sure if you are talking about the two AOA failures at Lion Air, or the two failures at two different airlines. At Lion Air, the accident aircraft had a defective AOA on a previous flight which was then replaced. We don't yet know why the first AOA was defective. Perhaps a ground worker bumped a piece of equipment into it. We do have evidence that the replacement AOA was 20 degrees out of calibration. Its DFDR readout exactly paralleled the good AOA, just 20 degrees higher. How this came to be is one of the subjects of the investigation. At Ethiopian, the DFDR data suggests that the AOA was working during the takeoff run, but was disabled shortly after liftoff, possibly by a bird strike.


I have never heard of an AOA probe failure, it happens very rarely. It must have been the AOA data that was corrupted, not the AOA probe itself. Only then did MCAS mix things up with the data (rubbish in, rubbish out). MCAS reacted as programmed, it received the wrong data.
AOA failures do happen, but until recently they did not result in a major accident. The DFDR records the AOA output, so we really do know that it was a faulty AOA, though the reason for the failure was different in the two accidents. Again, it wasn't MCAS's job to determine the stall condition. That responsibility belongs to the SMYD's. The problem was that it only took one SMYD to activate MCAS.
737 Driver is offline  
