PPRuNe Forums - Ethiopian airliner down in Africa

PPRuNe Forums (https://www.pprune.org/)

- Rumours & News (https://www.pprune.org/rumours-news-13/)

- - Ethiopian airliner down in Africa (https://www.pprune.org/rumours-news/619272-ethiopian-airliner-down-africa.html)

Many thanks to you all for your honest and concise replies

Quote:

Originally Posted by 737 Driver (Post 10465013)

Continuing the Threat and Error
What should one do when a barrier actually becomes a threat?

If you only have the runaway trim nnc , but now there is a crash few months earlier and some vague ad from the manufacturer.
Which might flag uas or it might flag as and it might leave you aircraft in a state you cant manually trim it back.
There is nowhere in the nnc saying if trim goes weird after raising flaps drop flaps back and reduce power.

Murphy's correction - Thanks.

Quote:

Minor correction, after a couple of false starts the autopilot was engaged for more than 30 seconds, just long enough to provide a false sense of "not that bad"?
Then it all hit the fan on short order with AP disconnect followed by MCAS.

I had the AP time on as 3 seconds. Corrected. Indeed, the real 30 seconds it would give time for a feeling of having overcome the problem - until the 9 seconds of trim. But looking back again, it might be that feeling of success, coupled with the fact STS runs the wheels (albeit briefly) anyway, that made him miss the sheer length of the run time. Hard to imagine missing that clunking, but the Stick Shake is quite loud, and as we've discussed, very distracting.

Let's face it. Than run time of 9 seconds, the lack of sustained ANU via the electric trim and the power so high are the main indicators of his state of mind. It's a terrible shame that he'd not got more height as I've a feeling he was starting to go down the right logic route. But only just starting, and coping with too much of a handful to really focus.

Yes, AVIATE comes first, and it's really shouting loud that the stresses were drowning what skill he had.

Quote:

Originally Posted by meleagertoo (Post 10465237)

And as such, why is it really so necessary to inform pilots of this system? There is no specific control over it, just the generic runaway trim procedure.

Isn't that exactly the problem (next to MCAS operation relying on a single AoA vane naturally)?
Obviously, in case of the 737 MAX MCAS accidents there was a lack of a clean manual override path similar to that present in case of the 737 NG STS and/or allegedly the MCAS variant installed on the KC-46 tanker where, in both cases, the automatic trim procedure could be overridden by manual column input? Had that been in place in case of the 737 MAX MCAS along with appropriate pilot training and full disclosure of the new system(s) and changes, I dare say we would not be having this lengthy thread here.

Wasn't Boeing's design philosophy supposed to be "pilot can always override automation"? And why was it so poorly respected in this instance as opposed to the cases when similar systems were introduced by Boeing in the past ? These are the questions one truly needs to raise to assess the "what went wrong here?" conundrum.

Quote:

Originally Posted by 737 Driver (Post 10465032)

I agree it is an incomplete picture (which I did acknowledge), but there are some broad enough outlines from which we can draw some conclusions. If anything comes out that substantially alters our current understanding, then I'll be happy to make a correction.

As far as what was going on while the the trim switches were in the cutout position, are you referring to the gradual movement from 2.3 to 2.1 units? It apparently occurred over two and half minutes. I'm interested in seeing what the board's thoughts are on that as well, but I should point out that in the context of the overall trim movement, it is a very small and slow creep.

I was actually wondering more about what was discussed and actioned during that time. Two and and half minutes is long enough for the initial startle factor to dissipate, hopefully some insight can be gained into pilots actions during the preceding critical time.
The prelim report mentions only one attempt at manual trim at 05:41:46, roughly half way through the cutout period, surely there was other activity during that 150 seconds.

One possibility is that the trim creep was due to attempts at manual trim causing a bounce in the cables that each time resulted in slight movement in the wrong direction. In the mentour pilot video you can see this bounce as attempts are made.

Another possibility is that one of the brakes is not holding against the load but that would be a seperate failure/design flaw that is probably not needed to explain the traces.
Access to the raw FDR data should resolve this since if it was a slipping brake it would likely be continuous whereas manual trim efforts would be seen as (slight) steps with pauses.

Quote:

Originally Posted by meleagertoo (Post 10465237)

Which rather vindicates Boeing's position on this; they reacted exactly as Boeing intended by identifying it as an STS runaway (which most assuredly is a runaway trim event) and dealt with it by using the correct pre-existing technique.

And as such, why is it really so necessary to inform pilots of this system? There is no specific control over it, just the generic runaway trim procedure. Surely telling people about systems they have no specific influence over is merely muddying the waters? If it presents itself as failure event X which is dealt with by checklist Y does anyone need to know that it could be system A or A.1 at fault, when both are addressed by the same checklist, show effectively the same symptoms and actually are components of the same system?

That, I am sure, was Boeing's rationale and though I'm not 100% comfortable with it I'm certainly not condemning it in the absolute and fundamental way some others are.

Except it was the jumpseater that identified the issue NOT the crew and it seems that neither the crew or the jumpseater understood what the issue was. No mention of stab trim runaway was made in the writeup as I recall.

Quote:

Originally Posted by rog747 (Post 10465107)

Are there any contributors here who are 737 pilots who transitioned to the MAX?

May I ask please,
If you did, did you have any SIM, classroom, or Line training on the MAX and it's differences, or was it purely on-line modules, thus was your first flight on a ''pax on board'' flight?

Were you made fully aware of the adverse pitch up changes, and CG issues of the new MAX due to the design enforced forward location of the new larger engines (which can now cause lift) at low weights/high power applications resulting in a (unrecoverable?) high AOA? (which we now are aware, necessitated the MCAS software patch)

Were you (before the 2 fatal and 1 nearly accidents) fully informed/trained on the new MCAS systems and it's functionality, implications, and what to do if it went rogue?

Thanks.

MAX was added to our fleet of NG's about a year ago. All training was either online or bulletins pushed to our Ipads. There is a quick reference card in the cockpit with key reminders. I had a couple of opportunities to fly the MAX before it was grounded. It actually flies very nicely, and the only real issue for me was that some of the switches and indicators were in different places. It would be comparable to transitioning from a 2001 Ford F-150 to a 2019 model. Drives pretty much the same, some new bells and whistles, some new switchology for the radios and climate control, but still a Ford F-150.

Our company continually stressed that the transition would be relatively straightforward, and to a certain point that was true in the context of normal operations. However, my contention always was (and this is not 20/20 hindsight) that any issues with the MAX would be less a case of normals operations, but rather non-normal ops. As we have seen in aviation time and time again, it is very difficult to predict all the unique failure modes that may arise with a new aircraft. Given that, my concern with the MAX was not with adapting to any differences when things were going right, but rather how different it might be when things were going wrong. Sadly, those concerns were not misplaced.

Boeing's biggest mistake was design not underestimating the public

Quote:

Originally Posted by meleagertoo (Post 10465237)

Boeing's big 'mistake' was to underestimate the public and to some extent the industry's interpretation of two failures due almost exclusively to bad handling and incorrect procedures that they could hardly have anticipated. At least, Boeing thought they could hardly have been anticipated at the time, and I doubt (m)any of us would have thought otherwise either before these accidents had we known about the system. Their mistake was to underestimate the amount and volume of criticism that would unexpectedly come their way because crews, maintenance and at least one airline screwed up in spades and the world retrospectively devined faults therefrom in Boeing that no one had thought were faults before and in a vindictive and vitriolic way unprecedented in the history of aviation..

I am not a pilot so my view may not be correct but I do design systems with functional safety requirments and I profoundly disagree with this. A system which cannot tolerate a single fault without entering a dangerous state which requires prompt action to prevent a catastrophe is not safe paticularily when at least one of the failures can occur in a high workload situation, must be responded to within a time limit and will generate misleading and distracting warnings. I am confident that I and all the teams I have worked in would have anticipated this would cause problems and would not have considered it an acceptable design.

Yes we are all human and may overlook failure modes with common causes or fail to understand complex interactions between sub-systems but this was just straightforwardly poor design which should have been identified as such.

The idea that Boeings big mistake was 'to underestimate the public and to some extent the industry's interpretation of two failures' is shockingly callous given the death toll and relatively small timespan. As far as we know the scenario concerned has occured three times and only been survived once and then perhaps a little fortuitously.

Quote:

Originally Posted by PiggyBack (Post 10465432)

There are many systems on an aircraft where one failure can cause entry to a "dangerous state".

MCAS was designed to be easily disabled by simply trimming the aircraft. There is no prompt action required. All that is need is for the pilot to FLY THE AIRCRAFT just as they were taught in their very first lesson. ATTITUDES and MOVEMENTS

Pilots are taught to always control the aircraft and to TRIM the aircraft to maintain that control. If the aircraft is not doing what you want it to, it is up to the pilot to MAKE it happen.

The MCAS "problem" is just a form of un-commanded or un-wanted trim. In addition to being a memory item, it is also just common sense to disable a system that is not performing correctly. In this case MCAS was causing nose down trim. If repeated nose up trim did not stop the unwanted nose down trim, turn off the electric trim.

Problem solved.

You can't really blame Boeing any more than you can blame Airbus for not predicting that the AF447 crew would forget that you need to lower the nose to unstall an aircraft, or that Airbus had designed the side sticks so that they cancel each other out.

The Refrain of Every Lousy Programer

Quote:

Originally Posted by PiggyBack (Post 10465432)

Everyone who writes lousy software has the same excuse, blame the user.

Threat and Error Management

Part 4

Continuing the Threat and Error Management discussion..... If you are just joining this sub-topic, please go back to the first post with the TEM graphic (Part 1)

First, a quick refresher. There are three components of the TEM model that are relevant here:

Threats are external and internal factors that can increase complexity or introduce additional hazards into a flight operations. Weather, unfamiliar airports, terrain, placarded aircraft systems, language barriers, fatigue, and distraction are examples of threats. Once a threat has been identified, the crew can take steps to mitigate that threat.

Errors are divergences from expected behavior caused by human actions or inaction that increase the likelihood of an adverse event. The difference between an error and a threat is that an error can, with careful attention, be quickly identified and crew members can find prompt solutions to the error. This is sometimes known as "trapping" the error. Untrapped errors can turn into new threats.

Barriers are structures, procedures and tools available to flight crew to trap errors and contain threats. Since no barrier is perfect, the goal is to build sufficient barriers so that all threats are contained and all errors trapped. Untrapped errors and uncontained threats can ultimately lead to an undesired aircraft state, incident, or accident.

The TEM model assumes that there are no perfect aircraft, perfect environments, or perfect humans. The goal is not to create a flawless system, but rater a resilient system.

The standard TEM model lists these available barriers for flight deck operations: Policies and procedures (SOP's), checklists, CRM, aircraft systems (particularly warning and alert systems), knowledge, and airmanship. Knowledge and airmanship are related to not only to training and experience, but also to an individual's commitment to develop their knowledge and airmanship. CRM includes such things as crew communications, monitoring, flight deck discipline, assignment and execution of specific duties. The Captain is the primary driver behind CRM, but the First Officer has obligations here as well.

In Part 3 of this series, I used the TEM model as a lens to analyze where and how the existing barriers failed. The primary reason that multiple barriers failed is that the effective employment of virtually all of these barriers depends heavily on the mental states of the two pilots. SOP's, checklists, CRM, knowledge, and airmanship only work as barriers when the crew can actually draw on them. It is unclear how much of this failure was due to lack of particular knowledge and/or skill as opposed to the inability to draw on existing knowledge and/or skill under pressure. There are indications that the Captain had achieved cognitive overload. This might have also applied to the First Officer, but we must also acknowledge that the FO had far less experience to draw on and may have had discomfort in speaking up. I believe one of the key takeaways from this accident is to appreciate the critical role of the First Officer in safe aircraft operations. A First Officer must not only be able operate the aircraft, run the checklists, and demonstrate knowledge of systems and procedures, he must be able to act as an effective barrier to trap not only his errors, but also the errors of the Captain.

When the traditional barriers failed, they effectively became new threats. These threats were subsequently uncontained and allowed errors to go untrapped leading ultimately to a hull loss and the death of all passenger and crew.

I ended Part 3 with the following question: What should one do when a barrier actually becomes a threat?

I'll be the first to admit that the "barrier as threat" is a bit different take on the TEM model, but I believe it is both valid and useful. From practical experience, I think TEM theory sometimes assumes that barriers are more resilient than they really are in practice and largely ignores the possibility that what was meant to be a barrier could actually become a threat.

However, by adopting a "barrier as potential threat" perspective, the TEM model actually provides some useful guidance. Threats should be identified or anticipated and steps should be taken to mitigate and contain those threats.

The key step here is awareness of the threat, or more specifically, awareness that what was initially considered a barrier might actually become a threat.

Let's go back to that list of potential barriers for flight deck operations - Policies and procedures (SOP's), checklists, CRM, aircraft systems, knowledge, and airmanship - and consider how these "barriers" might actually become threats.

Policy and procedures - I believe most airline SOP's provide useful barriers to the degree that the flight crew actually uses them. However, in some situations those policies may create unappreciated threats. For example, does the airline's policy drive an over-reliance on automation by mandating its use at all times? Do existing policies require/encourage Captains to do most of the actual flying leaving the First Officer ill-equipped to serve as an effective back-up? Do airline policies and/or culture create or sustain a steep authority gradient which discourages First Officers from speaking up or correcting errors by the Captain?

Checklists - Are the checklists (normal and non-normal) well designed? Do they help trap likely crew errors? If a crew member believes a checklist contains a potential threat, how amenable is their airline to modifying that checklist?

Crew resource management - Is the level of knowledge and proficiency of your Captain/First Officer sufficient to be an effective barrier? Is yours? Do the pilots use effective communication and social skills? Do they maintain cockpit discipline? Do they feel free to speak up and correct each other without creating tension?

Knowledge and airmanship - Does the crew receive the right kind of training to be effective? (Just refer to the "mantra" discussion if you need to be reminded of my position on this). Does that training prepare the crew for the known as well as the unknown? Does that training help mitigate the well-known startle and fear reflexes? Does that training emphasize systems management at the expense of basic aircraft skills? Does that training emphasize the need for the execution of NNC in a methodical and deliberate manner?

As we go through this list of questions (please add more if you like), we can develop a picture of where these barriers may actually morph into threats.

Once these new threats are identified, the next step is to attempt to mitigate those threats.

To be continued.....

The problem with TEM is that it tends to encourage linear thought - actions will create the desired resolution. I spent some time in the UK RAF where we often quoted the Boyd`Cycle (OODA Loop) which was more of a circular decision making process - think DODAR. The advantage of the Boyd Cycle is that you review the efficacy of your actions and then, potentially, choose additional or even different actions.

Of course, such flexibility and decision making (including potential divergence from checklists) requires experience and deep theoretical. knowledge. In that area I think we all agree that aviation is struggling, not just due to the training system but also due to the manufacturers not telling the full story.

People quote Sully as an example in that he ‘got the job done’ regardless of checklist.

To be continued..... Oh please no.

Is it really necessary to explain the complete TEM concept, to use this model to fit the few facts that are available, or is it that the facts are fitted the model in order to understand an individual’s (preconceived) viewpoint.

‘All models are wrong, but some are useful’ (George Box). The value of a model, like a tool is to select the appropriate one and know how it should be used; particularly its limits.

If you start with the human as a threat then you will conclude human error; alternatively starting with the human as an asset, pilot, designer, regulator, then with open thought, guided by a model, it may be possible to identify influencing factors, which in combination enabled the outcome.

Limitations of TEM Model
Assumes technical competency appropriate for role.
The threat-error-undesired states relationship is not necessarily straightforward and it may not always be possible to establish a linear relationship, or one-to-one linkage between threats, errors and undesired states. e.g. threats can on occasion lead directly to undesired states without the inclusion of errors;
and operational personnel may on occasion make errors when no threats are observable.
Essentially a ‘deficit’ model.
Benchmarks against a standard ‘safe’ or ‘safe enough’ i.e., other operators.
Descriptive: It describes an outcome or end state not how to get there.
Little focus on minimisation of error
Links the management of threats and errors to potential deficiencies in HF & NTS skills but not the processes supporting good TEM behaviour.
Same challenge as ‘Airmanship’
(https://www.casa.gov.au/sites/g/file.../banks-tem.pdf)

Quote:

Given that, my concern with the MAX was not with adapting to any differences when things were going right, but rather how different it might be when things were going wrong. Sadly, those concerns were not misplaced.

Sadly, those concerns WERE misplaced.

There are the legacy commands that line up, not necessarily under non-normal ops. Look what happened when, what was it V10 of the HW FMS software came out? That one didnt last long.

The if/then sequence of commands can get one to a line in the code that has been long forgotten. A few that come to mind are the balked TOGA with a bounce, or after crossing a FO waypoint, the ac porpoises down to the AA level of the next waypoint, and of course, the lookup finding a simple radius of the Earth instead of the Geoid.

Unintended consequences of legacy programming. I would love to see a V1.0 of the FMS.

Quote:

Originally Posted by safetypee (Post 10465550)

‘All models are wrong, but some are useful’ (George Box). The value of a model, like a tool is to select the appropriate one and know how it should be used; particularly its limits.

I agree. The TEM model has its limitations, but it also has its uses. One of its primary benefits is that is a key part of the language of aviation safety. Pilots are usually on the receiving end of this dialogue. I submit that it can be pointed in the other direction.

Quote:

Originally Posted by Cows getting bigger (Post 10465545)

If you go back and look at the original graphic, you will see that it does incorporate a cycle of input/output/review. I'm familiar with the Boyd Cycle, and it is appropriate in some circumstances, but it is less useful in setting up a resilient system in the first place. The OODA loop is more applicable once you are responding in the environment that has already been established.

Yep, like an aircraft trying to kill you when you've lost your way through process. :)

Just another reiteration of some issues with MCAS' flawed logic, as discussed here and elsewhere... (with my emphasis)

Boeing says no flaws in 737 Max. Former engineer points to several

Quote:

...Boeing CEO Dennis Muilenburg said the planes went down because of a chain of events.

"One of the links in that chain was the activation of the MCAS system because of erroneous angle attack data," he said at a recent news conference.

Peter Lemme, a former Boeing engineer and former FAA designated engineering representative, said MCAS is the main link. The flaws in that system, he said, need to be addressed...

***First, MCAS activated because of a single sensor with a false reading. On the Ethiopian jet, one indicator swung from showing a normal ascent to showing a steep ascent. Lemme said in that case it was a clear sign of failure.

"Having the vane change from 15 to 75 degrees in two seconds — it is immediately an indication of a fault. There's just no physical way to do that," he said. "And then 75 degrees is kind of a ridiculous number."

But MCAS acted on it, even though a sensor on the other side of the plane reported everything was fine.

"That was a big disappointment. If the systems had declared the signal failed then MCAS would not have fired and nothing would have happened," he said.

***Both planes were flying at a great speed when they crashed — another flaw, according to Lemme because MCAS should have stopped at that speed.

"There is no way to stall the airplane at that airspeed and MCAS should have had logic in place that would prohibit it from operating," Lemme said.

***The Lion Air flight pitched forward more than 20 times before that plane crashed into the sea. That is the greatest flaw in MCAS, Lemme said: the repeated descents.

"It persistently attempted to move the stabilizer down without giving up. I think if MCAS hadn't had the repeated feature where it could re-trigger, we probably would have been OK," he said.

Lemme said testing should have caught the problems with MCAS.

"That should have been found. You would expect the test program would look at the likely failure modes," he said. "That is a breakdown in the test program."...

- https://www.kuow.org/stories/engineer-gap-flaw-mcas

And the source of the above article is very well-respected as one of Seattle's NPR radio stations, KUOW, so should be taken seriously.

Quote:

Originally Posted by Lost in Saigon (Post 10465468)

It may be interesting to note that what appears to be the vast majority of people who are responsible for designing, developing and delivering safety critical systems for a living (I am another example - high software content military life critical systems amongst other things) who have commented find the Boeing approach at best questionable, and for my part very concerning (as a very regular pax). I had expected better from the aviation regulation process.

Equally concerning are the folk that fly these machines who also appear to feel that this type of potentially inadequate (and demonstrably dangerous) systems design is acceptable, it may be the norm, and it may be what you are used to . . . but I'm surprised . . .

Edit : A wise man in the military safety community once told me that if I wasn't personally prepared to trust my life to the system I designed I shouldn't be in the industry . . . I wonder whether that ethos has been diluted in aviation . . . I hope not . . .

Fd