Boeing 737 Max Recertification Testing - Finally.
Join Date: Dec 2019
Location: OnScreen
Posts: 410
This is one of the many, many reasons why the B737 should never have reached the MAX version. It looks like hindsight, but a proper engineering company would have understood that the whole B737 design had become obsolete and should not have been given another life extension.
Join Date: Dec 2019
Location: OnScreen
Posts: 410
No, it wasn't.
The instructions only worked when the problem was recognized immediately. After that, the airplane was doomed.
The Boeing instructions did not even consider delayed recognition of the problem.
You know, there are reasons why the B737 MAX was grounded for 18+ months (and in China 2+ years):
THERE ARE NO SUITABLE INSTRUCTIONS POSSIBLE to realistically overcome the AoA vane mishap.
IIRC, later on, in a flight simulator, experienced pilots had to react within 4 seconds of an AoA vane mishap to be able to save the aircraft.
We may need to go back to the AOM and AD to confirm the exact wording. However, the system logic, flawed as it was, would not have been unrecoverable if the crews had been given enough information to counter a runaway trim in the first instance with a manual trim input, so that the trim returned to a normal range before the STAB TRIM CUTOUT switches were set to CUTOUT.
If the first action was to go to CUTOUT before putting the aircraft into an "in trim" condition then, yes, it was possible that the speed, the trim and the residual elevator authority, together with the minimal torque available from the manual trim wheel in a severe out-of-trim case, would leave a stabiliser that overpowered the manual trim system unless the elevators were unloaded, a technique that is pretty exciting for passengers and pilots alike, who get to see the world big in their windows.
IIRC, the AD included comments about being in trim, but at no point expanded on how critical that action was, nor on the industry's awareness of the limits of a 60's accepted trim architecture that barely had a fully compliant backup. In the absence of knowledge about out-of-trim versus manual trim torque constraints, this was an ill-considered document with the potential to result in a bad outcome.
Blaming the ET302 pilot for merely being a pilot and not being Tex Johnston is hardly the standard of excellence of the Old Boeing, before its contamination with MDD management. "The New Boeing" is the one that sacked QA managers for doing their job, that managed systems that resulted in fasteners on the B787 being a different size on the east coast and the west coast... (Coriolis?), that gave us MCAS and the KC46 debacles, VOL I and VOL II, and generally messed up a proud company... yeah.
The weasel words applied did not make it easy for a crew confronted with a change that was very recent and that had been neither trained nor explained in depth to the flight crew.
The evidence is that, of the 3 events (the one on the sector prior to Lion Air's splash, the Lion Air accident itself, and ET302), 2 beat the drivers, and one of those driver sets had the new instructions. It would seem to have been quite reasonable, nay necessary, to go back and sort it out at Fort Fumble.
All corporate management of TBC since 1995 bears responsibility for the damage done to the engineering reputation of that company and should be held accountable. Their actions damaged shareholder value, and there was repeated evidence that they were heading into the weeds with their myopic management practices.
Join Date: Dec 2019
Location: OnScreen
Posts: 410
We should not forget that a Parkinson-like trim usage would have saved the show, though that was left out of the Boeing documentation. Maybe because it was unknown, or just because it would have brought a significant change in the trim computer to light and required additional training?
Psychophysiological entity
Join Date: Jun 2001
Location: Tweet Rob_Benham Famous author. Well, slightly famous.
Age: 83
Posts: 3,171
Being an old retired guy, I had the time to read in on every post in PPRuNe's threads (plural). It was a long haul and would take a month . . . or two, to summarise.
Just a few memories to ponder, in no particular order:-
I found what might be the only mention of MCAS in a South American pilot's handbook. Four or five shortened lines on a right hand page. It was found by chance. When I posted it on PPRuNe folk looked at my link, but as I recall, no one had found another reference to MCAS world-wide. (as of back then)
The Ethiopian captain might well have been more affected by the chaos having had minimal and vague explanations of a mysterious system. One thing that would be burning into his brain would have been that the other aircraft crashed. This is not just an idle response to an above post, but something I pursued at length back then. I still doubt that any instruction in that 5 months was equivalent to part of a type rating written.
"it's as though STS is working in reverse". An odd quote - more indicative of the pilot's state of mind than reasoned systems analysis.
That nine second MCAS pitch down run.
After some considerable time, Sully's quote. "That could have claimed me."
My own self-opinionated original thoughts . . . slowly weighed down by the vivid descriptions of chaotic sights and sounds. Memories of how distracting 20 mins of stick-shaker had been for me. Just the stick-shaker, everything else spot on normal. Later, I was astonished at how it had soaked into my brain.
World wide lack of awareness about the Toronto 707 hand cranking - and how close it had been to disaster. And now the 47' horizontal stabilizer has to be cranked by a wheel with a smaller radius. This is not a linear burden.
Not our members of course, but an almost world-wide lack of understanding about losing the Pickle Switch function after switching the two switches that all good pilots would have switched - and doing it in a microsecond.
For weeks on Quora I posted much of what I'd learned on PPRuNe. I had to be careful: for some hours there was just me, with thousands of hits worldwide. Some Boeing skippers let us know how American pilots would have done it. Soon everyone and their uncle was a Boeing MAX instructor. The point of all this is the confusion. I'd take an hour to write a few lines, yet still manage to confuse someone. Good reporting certainly deserved that Pulitzer Prize.
What struck me at the time was that there wasn’t an immediate trigger for action in the way the fault presented itself. The 737 trim is active all the time during flight; in fact it is unusual for the trim wheels to *not* be in motion for any length of time: STS, MCAS, config changes, CofG changes, etc.
The Boeing checklist trigger for trim runaway at the time was “continuous uncommanded trim motion”, which guards against an electromechanical runaway, but that wasn’t what happened: it was a software failure that only moved the trim under certain circumstances. How could you tell the difference between, say, STS doing its job and an MCAS failure? The answer is that, in the short term, you couldn't; abnormal operation appeared the same as normal operation unless you had a long diagnosis period, by which time it was too late.
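For anyone who thinks better in code, here is a toy sketch of the point about the "continuous" trigger. It is not Boeing's logic or any real checklist criterion. The trim rate, the 10-second "continuous" threshold and the burst/pause pattern are all made-up illustrative numbers, apart from the nine-second run length discussed in this thread. It only shows how a rule keyed on uninterrupted motion can miss a fault that arrives in bursts, even when the net trim change is the same.

```python
def longest_continuous_run(rates, dt):
    """Longest uninterrupted stretch (seconds) of same-direction trim motion."""
    longest = current = 0.0
    prev_sign = 0
    for rate in rates:
        sign = (rate > 0) - (rate < 0)
        if sign == 0:
            current = 0.0      # trim idle: the run is broken
        elif sign == prev_sign:
            current += dt      # same direction as the previous sample
        else:
            current = dt       # a new run starts
        longest = max(longest, current)
        prev_sign = sign
    return longest


def net_trim(rates, dt):
    """Total trim change over the whole profile (negative = nose down)."""
    return sum(rates) * dt


DT = 1.0             # sample period in seconds (assumption)
RATE = -0.25         # hypothetical nose-down trim rate, units per second
CONTINUOUS_S = 10.0  # hypothetical threshold for calling motion "continuous"


def looks_continuous(rates):
    """Toy version of a 'continuous uncommanded trim motion' rule."""
    return longest_continuous_run(rates, DT) >= CONTINUOUS_S


# Classic runaway: 27 seconds of uninterrupted nose-down trim.
runaway = [RATE] * 27
# Intermittent fault: three 9-second bursts separated by 5-second pauses.
intermittent = ([RATE] * 9 + [0.0] * 5) * 3

for name, profile in (("runaway", runaway), ("intermittent", intermittent)):
    print(name, looks_continuous(profile), net_trim(profile, DT))
```

Under these assumed numbers both profiles end with the same net nose-down trim, but only the uninterrupted one trips the toy rule.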
How would the crew know there was an electromechanical failure? Do they rip the wiring apart looking for the short circuit before turning off the trim switches? How long is "continuous"? STS doesn't run at top speed for 30 solid seconds, which is more than enough to put 100 pounds on the wheel. Trim will stop at the upper or lower limits of travel, so by definition it cannot be "continuous." I had a recent electrical issue in my house, where power would cut out and come back on, caused by a loose wire at the distribution transformer waving in the breeze and sometimes making a short circuit to ground. If a similar situation happened, an intermittent but interfering trim problem caused by a wiring defect, say chafing, or a loose bit of solder in a trim switch, would that also be a hands-up, cannot-be-solved situation?
The 737 trim is active all the time during flight; in fact it is unusual for the trim wheels to *not* be in motion for any length of time.
STS tries to ensure that the trim load is zero. This is why the Lion Air crew reported of MCAS that "STS is running backwards": it was adding to the trim load, not making it go away. The fact that an unexpected 10, 20, 30, 40, 50, 60 pounds of trim load was on the wheel is enough to tell that there is a trim problem. That using the wheel trim switch countered the trim load occurred to the first Lion Air crew and to the captain of the second Lion Air crew, who apparently thought using it was obvious enough that he didn't mention it to the First Officer.
A (plausible) electromechanical failure would be when nothing you can do with the normal flight deck controls can stop the trim running in a particular direction, so swift intervention is necessary before it goes to the stops. On my current type (777) you get a warning as soon as the monitoring picks this up. If you disconnect the trim every time it moves automatically, you’d do it shortly after takeoff on every flight. There is no indication to the pilots as to whether it’s MCAS, STS or even the other pilot doing the trimming, apart from the speed, and that doesn’t really help much; an intermittent fault would, again, look like normal operation until it really showed its hand.
In a critical, high workload phase of flight, near the ground, experiencing something novel that doesn’t easily categorise and requires cognition and an accurate mental systems model (not present, through no fault of the pilots) to diagnose would confuse even experienced operators. That’s why we use rule-based behaviour for Time Critical Events, such as RTO, GPWS, Windshear and Trim Runaway, but these are triggered by specific criteria which are learnt and practiced by rote because there is not time for pontification. Sadly, I think the accident crews never really got beyond the startle/react phase as there were too many audible, tactile and mental distractions to allow much in the way of a diagnostic loop to develop.
Since with Boeing it is all about the dollars, maybe they should have thought of the old adage: "If you think paying for safety is expensive, try paying for the accident."
Join Date: Jul 2013
Location: Within AM radio broadcast range of downtown Chicago
Age: 71
Posts: 678
Serious question: I'm not arguing that "it was obvious enough" was not the reason, but rather asking whether it's a necessary inference?
Psychophysiological entity
Join Date: Jun 2001
Location: Tweet Rob_Benham Famous author. Well, slightly famous.
Age: 83
Posts: 3,171
Handing over to the FO to free up a bit of brain-load and he doesn't do the one thing - use the Pickle Switches - that might well have given the clue how to save the aircraft.
An important point. Would the captain be referring to the load while hand flying, or just the spin direction of the manual trim wheel? My highlight.
PPRuNe 2nd Nov 2018
The speed of course was a bit of electronic guesswork by then.
Originally Posted by MechEngr
STS tries to ensure that the trim load is zero. This is why the Lion Air crew reported of MCAS "STS is running backwards" because it was adding to the trim load and not making it go away.
PPRuNe 2nd Nov 2018
. . . The purpose of the STS is to return the airplane to a trimmed speed by commanding the stabilizer in a direction opposite the speed change . . .
Loose rivets, OK - been reading a lot more.
The speed target for STS appears to be either a previously set speed or the speed the plane was going when a pilot last let off the trim switch - sounds like an action similar to cruise control in a car. Lock it in at a set speed, but if I trim that speed up or down, the cruise control uses that new speed. However, it's smarter in that one of the problems is needing to handle the undamped phugoid which it does by reacting more quickly than the natural oscillation of the plane.
Per B-737 Speed Trim System
From that thread: it was to solve the problem that, at aft CG and high thrust, there isn't enough trim reaction force to meet the minimum gradient of 3 pounds per 1 degree of AoA change. If the CG were at the Center of Pressure, no stick force would be required for any AoA change; hence this moves the trim opposite to the pilot input as the CG approaches that (hopefully unreached) condition. That is, if the pilot pulls back to slow the plane, the STS supplies nose-down trim to encourage the pilot to speed it back up.
---
It appears the effect of STS should have been to push the nose up as the plane accelerated and MCAS was pushing the nose down; the opposite. While STS doesn't move to relieve trim loads, it moves to reset the speed to where the trim load is zero unless the pilot is pulling or pushing.
From the 737 page Flight Controls :
Speed trim is applied to the stabilizer automatically at low speed, low weight, aft C of G and high thrust. Sometimes you may notice that the speed trim is trimming in the opposite direction to you, this is because the speed trim is trying to trim the stabilizer in the direction calculated to provide the pilot with positive speed stability characteristics. The speed trim system adjusts stick force so the pilot must provide significant amount of pull force to reduce airspeed or a significant amount of push force to increase airspeed. Whereas, pilots are typically trying to trim the stick force to zero. Occasionally these may be in opposition.
By the sounds of everything, the Cessna 172 behaves the same way: When you get off the trim speed, a stick force develops. The STS only increases this stick force because otherwise it's too weak to meet certification.
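A minimal sketch of the cruise-control reading of STS described in this post, for illustration only. It is not the real STS control law. The class name, the gain and its units are hypothetical, and the only behaviour taken from the thread is that the target speed resets when the pilot lets off the trim switch and that the trim command opposes the speed change.

```python
class SpeedTrimSketch:
    """Toy model: trim opposes any drift away from a remembered target speed."""

    def __init__(self, target_kt, gain=0.5):
        self.target_kt = target_kt
        self.gain = gain  # hypothetical: command units per knot of speed error

    def pilot_released_trim(self, current_kt):
        # Like resetting cruise control: the current speed becomes the target.
        self.target_kt = current_kt

    def trim_command(self, current_kt):
        """Positive = nose-up trim, negative = nose-down trim."""
        # Faster than target -> nose up to slow down;
        # slower than target -> nose down to speed up.
        return self.gain * (current_kt - self.target_kt)


sts = SpeedTrimSketch(target_kt=200.0)
print(sts.trim_command(210.0))   # aircraft has accelerated
print(sts.trim_command(190.0))   # aircraft has slowed
sts.pilot_released_trim(190.0)   # pilot trims, then lets off the switch at 190 kt
print(sts.trim_command(190.0))   # at the new target speed
```

In this toy model an accelerating aircraft gets a nose-up command, which is the opposite of a nose-down MCAS input and fits the "running in reverse" impression.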
Join Date: Dec 2019
Location: OnScreen
Posts: 410
And with the stick-shaker shaking your teeth out, there is little capacity left to monitor muscle tension and so determine whether the aircraft is out of trim, until the yoke forces reach the order of magnitude of the stick-shaker forces. This happens in seconds, so yeah, before you realize it, the yoke force gets immense and the whole thing is beyond recovery.
Join Date: Dec 2019
Location: OnScreen
Posts: 410
I didn't say it did. Perhaps I need more words. Let me clarify to unwind your concern.
The effect the pilot expected from STS was to push the nose up; instead, MCAS pushed the nose down. To the pilot it appeared to be operating in the opposite direction, which would have been the reason for reporting it that way to maintenance.
Psychophysiological entity
Join Date: Jun 2001
Location: Tweet Rob_Benham Famous author. Well, slightly famous.
Age: 83
Posts: 3,171
I went back to the 2nd Nov 18 and read ManaAdaSystem with more care. His paste is from an FCOM? Must be right, surely? Gasp! This is exactly the kind of answer ChatGPT churns out when it runs out of specific knowledge. (ChatGPT can become the wandering mind of infant artificial intelligence.)
The thread goes on with good blokes trying to make head or tail of it. I'll come back tonight when I've had a drink.
B-737 Speed Trim System
Join Date: Jul 2003
Location: An Island Province
Posts: 1,240
"Systems are designed and constructed from components that are expected to fail.
As the complexity of a system increases, the accuracy of any single agent's (person's) own model of that system decreases rapidly."
A quote from a report on coping with complexity in IT malfunctions. There are many similarities with the operator and design issues of the Max, except for the timescales and the number of people involved.
Other 'cherry picked' quotes; read the full report for context.
Surprise
In all cases, the participants experienced surprise. … mainly discoveries of previously unappreciated dependencies that generated the anomaly or obstructed its resolution or both. The fact that experts can be surprised in this way is evidence of systemic complexity and also of operational variety.
A common experience was "I didn't know that it worked this way." People are surprised when they find out that their own mental model of The System doesn't match the behavior of the system.
More rarely a surprise produces astonishment, a sense that the world has changed or is unrecognizable in an important way. This is sometimes called fundamental surprise … four characteristics of fundamental surprise that make it different from situational surprise:
1. situational surprise is compatible with previous beliefs about ‘how things work’; fundamental surprise refutes basic beliefs;
2. it is possible to anticipate situational surprise; fundamental surprise cannot be anticipated;
3. situational surprise can be averted by tuning warning systems; fundamental surprise challenges models that produced success in the past;
4. learning from situational surprise closes quickly; learning from fundamental surprise requires model revision and changes that reverberate.
This adjustment of the understanding of what the system was and how it worked was important to both immediate anomaly management and how post-anomaly system repairs add to the ongoing processes of change.
Uncertainty and escalating consequences combine to turn the operational setting into a pressure cooker and workshop participants agreed that such situations are stressful in ways that can promote significant risk taking.
Reread the surprise section from alternative viewpoints: operators were surprised, as were the manufacturer, the regulator and ourselves; which types of surprise?
PPRuNe - surprise; a forum for ill considered post-mortems.
Experts are typically much better at solving problems than at describing accurately how problems are solved. Eliciting expertise usually depends on tracing how experts solve problems. … experts demonstrated their ability to use their incomplete, fragmented models of the system as starting points for exploration and to quickly revise and expand their models during the anomaly response in order to understand the anomaly and develop and assess possible solutions.
… focused on hypothesis generation. [ not seeking to follow SOPs existent or not ] These efforts were sweeping looks across the environment looking for cues. This behavior is consistent with recognition primed decision making.
… organizations which design systems... are constrained to produce designs which are copies of the communication structures of these organizations.
The alerts draw attention but they are usually not in themselves, diagnostic. Instead, alerts trigger a complex process of exploration and investigation that allows the responders to build a provisional understanding of the source(s) of the anomalous behavior that generated the alert.
It is unanticipated problems that tend to be the most vexing and difficult to manage.… unappreciated, subtle interactions between tenuously connected, distant parts of the system.
Don't overlook the end sections: how much dark debt is the industry carrying? An ever-increasing amount, due to automation and operational complexity, set against constant, limited human performance.
"dark debt"; vulnerability was not recognized or recognizable until the anomaly revealed it. … found in complex systems and the anomalies it generates are complex system failures.
Dark debt is not recognizable at the time of creation. … it is a product of complexity, adding complexity is unavoidable as systems change.
Ref https://snafucatchers.github.io
Further 'cherry picked' quotes from the same report:
- Each anomaly arose from unanticipated, unappreciated interactions between system components.
- There was no 'root' cause. Instead, the anomalies arose from multiple latent factors that combined to generate a vulnerability.
- The vulnerabilities themselves were present for weeks or months before they played a part in the evolution of an anomaly.
- The events involved both external software/hardware …
- The vulnerabilities were activated by specific events, conditions, or situations.
- The activators were minor events, near-nominal operating conditions, or only slightly off-normal situations.
Surprise
In all cases, the participants experienced surprise. … mainly discoveries of previously unappreciated dependencies that generated the anomaly or obstructed its resolution or both. The fact that experts can be surprised in this way is evidence of systemic complexity and also of operational variety.
A common experience was "I didn't know that it worked this way." People are surprised when they find out that their own mental model of The System doesn't match the behavior of the system.
More rarely a surprise produces astonishment, a sense that the world has changed or is unrecognizable in an important way. This is sometimes called fundamental surprise … four characteristics of fundamental surprise that make it different from situational surprise:
1. situational surprise is compatible with previous beliefs about ‘how things work’; fundamental surprise refutes basic beliefs;
2. it is possible to anticipate situational surprise; fundamental surprise cannot be anticipated;
3. situational surprise can be averted by tuning warning systems; fundamental surprise challenges models that produced success in the past;
4. learning from situational surprise closes quickly; learning from fundamental surprise requires model revision and changes that reverberate.
This adjustment of the understanding of what the system was and how it worked was important to both immediate anomaly management and how post-anomaly system repairs add to the ongoing processes of change.
Uncertainty and escalating consequences combine to turn the operational setting into a pressure cooker and workshop participants agreed that such situations are stressful in ways that can promote significant risk taking.
Reread the surprise section from alternative viewpoints: the operators were surprised, but so were the manufacturer, the regulator, and ourselves. Which type of surprise applied to each?
PPRuNe - surprise; a forum for ill-considered post-mortems.
Last edited by alf5071h; 27th Mar 2023 at 16:42.