Are we facing a safety issue?
25th Oct 2009, 20:07
PJ2
 
lomapaseo;
so how do we address this?
Organizations that have never had to deal with a major accident are prone to an unusual arrogance when examining information that contradicts their collective self-image or "culture".

Curiously, these are the airlines most at risk: production-driven managers see their priorities and behaviours as "successful" and, having already "learnt", see no reason to change.

NASA
Despite clear precursors in its operational data, NASA continued to operate the shuttle with ongoing O-ring anomalies and foam-shedding. Both precursors produced significant and numerous data-points throughout years of operations. Such anomalies were "normalized" into each launch, to the point where serious questions could not be asked of the engineering people without heavy resistance from a "can-do" mentality.

The key to understanding this is the question always asked by "production-minded" managers: "if you think we're not safe to operate, where are the accidents to prove your point?" In other words, precursors to accidents are not taken as "real", nor are near misses. The notions of "luck" and "skill" are often invoked in such a mentality.

The Concorde accident was the result of the same kind of thinking. There were many, many data points indicating that shrapnel from shredding tire retread material could, and in some cases did, damage the wings and compromise the fuel tanks. A flight out of (I believe) Washington DC actually caught fire when a tank was holed by shrapnel, though the fire did not take hold.

To cite an example much closer to the homes of most airlines, the ATSB's report on the QANTAS B-744 overrun at Bangkok stated quite succinctly that the accident was already present in the airline's flight data in the years leading up to the overrun. Reduced flap settings, idle reverse, high speeds over the fence and long landings were all in the data, but nothing was done to address these issues, so they continued to accumulate data-points. The accident occurred when, as always, several other issues coincided with these factors. The accident was "preventable" (by changing the approach SOPs), but the data cannot tell an airline "which airplane, which day?"
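To make the precursor idea concrete, here is a minimal sketch of the kind of exceedance-flagging a flight-data monitoring programme performs on landing data. All field names and threshold values here are hypothetical illustrations, not actual FDM/FOQA event definitions:

```python
# Minimal precursor-flagging sketch for landing data.
# Field names and thresholds are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class Landing:
    flight_id: str
    flap_setting: int             # flap, degrees, at touchdown
    reverse_idle: bool            # idle reverse selected
    speed_over_fence_kt: float    # excess above Vref, knots
    touchdown_distance_ft: float  # distance past the threshold

def precursors(l: Landing) -> list[str]:
    """Return the precursor conditions present on one landing."""
    found = []
    if l.flap_setting < 30:
        found.append("reduced flap")
    if l.reverse_idle:
        found.append("idle reverse")
    if l.speed_over_fence_kt > 10:
        found.append("fast over the fence")
    if l.touchdown_distance_ft > 2500:
        found.append("long landing")
    return found

# Illustrative records only; a real programme would scan every flight.
fleet = [
    Landing("FLT001", 25, True, 14.0, 3100.0),
    Landing("FLT002", 30, False, 4.0, 1500.0),
]
for landing in fleet:
    hits = precursors(landing)
    if len(hits) >= 2:
        print(f"{landing.flight_id}: {', '.join(hits)}")
```

No single flagged landing predicts an accident; the point of the Bangkok example is that the combination and rate of these conditions is visible in the data long before the holes line up.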

I know another airline in which this very same phenomenon is occurring even as I write; the holes haven't lined up yet, but they will. The data is telling them something, in fact a number of things, that production-minded managers have so far ignored, because they habitually "explain away" the outliers and rest content with an admirable, relatively accident-free history.

Your question may be posed on many levels, and any tentative response must be understood on those same levels if it is to be effective and implementable.

First, we must recognize that we are seeing fatal accidents of a different nature today than just twenty years ago, a very short time in this business. Twenty years ago, mid-air collisions, CFIT and a few mode-confusion causes were what we saw. Off the top of my head, we have had at least a dozen, possibly thirteen, major loss-of-control accidents in the last eight or nine years* in which training, competency, skill or situational awareness form a significant part of the discussion of causes.

We may examine a single "event" in the data and pose one response which may resolve that one issue, once; it may then recur because it has not been "learned from". Or we may move to a much broader view and discuss how deregulation of the airlines has provided fertile ground for the emergence of serious but thus-far latent systemic safety issues. Such discussions can't resolve individual problems, but they can point to revisions of the system which preclude the worst of individual events.

It is important to be able to move within these levels, but I wonder whether most airline CEOs are capable of such thinking these days, with production pressures so high. In such cases we must turn to other areas in which to work, and otherwise evoke a response that suitably addresses the precursors of an accident without introducing irrationality into the operation.

We obviously can't "re-regulate" and expect to solve the problem at that very high level of systemic cause. What's next?

Next level "down" might be defined in terms of SMS. If the regulator continues or returns to a level of oversight such that airlines will not or cannot get away with cutting corners by privileging commercial priorities over safety priorities, then SMS will survive as a very effective data-driven safety system just as it was originally conceived.

What comes next follows on from SMS. Data analysis by competent, experienced individuals, including pilots, and not just by amateur interns paid next to nothing, is the least harmful, least expensive and most benign response possible. I think looking at and understanding the data in terms of "precursors" is where the solution, and the answer to your question, lies. The difficulty is that being shown information that "all is not well" interferes with egos and with the self-image of the airline as a completely safe operation, and therefore interferes with the feedback process which is essential to "learning lessons". How do we address this?
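What does "understanding the data in terms of precursors" look like in practice? One simple view is the monthly trend in the rate of flagged events. A sketch, again with purely illustrative numbers and a deliberately crude trend test; a real programme would use proper statistical process control:

```python
# Hypothetical monthly counts of flagged precursor events per 1,000 flights.
# The analyst's question is not "where is the accident?" but
# "is the rate drifting upward?"
monthly_rate = [1.8, 2.1, 2.0, 2.6, 2.9, 3.4]  # illustrative numbers only

def drifting_up(rates: list[float], window: int = 3) -> bool:
    """Crude trend test: is the mean of the most recent window above
    the mean of the earlier period?"""
    if len(rates) <= window:
        return False  # not enough history to compare
    early = rates[:-window]
    recent = rates[-window:]
    return sum(recent) / len(recent) > sum(early) / len(early)

if drifting_up(monthly_rate):
    print("Precursor rate is rising - worth examining, not explaining away.")
```

The output of such a trend is exactly the kind of "all is not well" information that collides with egos and self-image, which is why who reads it, and how honestly, matters as much as the arithmetic.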

We accomplish this by using the lessons NASA learnt from the Challenger and Columbia accidents, lessons which are extremely well documented and openly available in just two very good books: "The Challenger Launch Decision" by Diane Vaughan, and "Organization at the Limit" by William Starbuck and Moshe Farjoun.

We first examine the themes of these lessons, then examine how they may be tailored to our own operation. Then we take it to the practical level, using materials which reflect our own operations "at street level", turning theoretical work into practical approaches to daily safety issues. Culture is informed by the data, and the data is valued within the culture.

The questions asked are the most important. Bill Starbuck's paper "Fine-Tuning the Odds Until Something Breaks" is also well worth reading, but many works, by Sidney Dekker for example, now address these issues in a much more practical way for those airlines that are serious about knowing and learning.

This is what I meant by "what", not "who", in the discussion above. Where "learning the lesson" fails and arrogance wins, we will always find that what took the organization off course was "who", never "what".

If an airline does not use, look at or respond to its data (or doesn't even collect data except to tick the box), then there is effectively nothing to be done. Organizations cannot be "forced" to be safe, especially when they can point to their record of "no accidents" or "explainable accidents".

Data analysis works because recognition is the first step in addressing a problem, if there is one. Avoiding denial is the next step, followed by defining the problem, with experienced people examining the data, revising the SOP or procedure, and publishing the results in the ops manuals.

Of course, as we have discussed here and elsewhere, this assumes an airline is first of all aware of these issues (that it may be "at risk" even though it doesn't want to think so), and that it wishes to do something about a problem which is showing up in the data.

One can work "inside", with like-minded people who also have the power and courage to do the right thing. That happens once in a while, and we shouldn't be too idealistic - it works and is sometimes needed.

Short of reasonably-expected intelligent responses, or working with those outside the organization who have the power to enact change, the alternative, frankly, is "heroic action" - the kind of direct action that, for example, Greenpeace, Amnesty International and thousands of other groups engage in. Whistle-blowing, for example, is both a short-term solution and far riskier in terms of achieving the originally-conceived goals. In other words, unless an organization is cognizant of all aspects of its operation, and not just the commercial, profit-making aspects, then sooner or later, in a relatively high-risk operation such as aviation, the outcomes are well understood.

If the airline isn't doing any of this on a daily basis, then nothing has been learnt, and the answer to your question is that there is nothing to be done except to wait; the airline is on its way to learning from an accident. Just ask NASA.


* Gulfair 320, Armavia Air 320, TAM 320, Spanair MD83, Turkish B737, Colgan Q400, One-Two-Go MD82, Garuda B737, Comair CRJ, Pinnacle CRJ, Helios B737, Adam Air B737, possibly AF447...
