PPRuNe Forums - View Single Post - BA delays at LHR - Computer issue
Old 28th May 2017, 06:33
#127
Nialler
 
Join Date: May 2008
Location: Paris
Age: 60
Posts: 101
Originally Posted by ImageGear
and this is totally unacceptable as BA have found to their severe cost. Whoever took the decision to implement a system where recovery is to IPL and blow the reservations system out of the water with no failover needs to take the long walk.

To call the latest crop of "big servers", mainframes, is blatant misrepresentation. True fault tolerant mainframes no longer really exist.

30 Years ago, I ran a demonstration in Paris of multi-host file sharing across duplicate mainframes, where it was proved that on catastrophically failing one mainframe, the end result was not one transaction being lost anywhere between the reservations terminal and the failing computer memory module. The impact on the reservations clerk was a "wait" screen that lasted around 40 seconds. The solution sold, but not to BA.

There is absolutely no excuse, 30 years later, for any airline reservations department to accept anything less.

Some very big heads must roll.

Imagegear.
You misunderstood me. My entire career has been spent working on mainframes. The type of failover capabilities you describe are not quite thirty years old, but you are correct that not just sysplex but geographically dispersed sysplex is possible. Not just possible but standard. My clients all have datacentres at sites which are remote from each other and are cycled on a scheduled basis. OK, I'm an old IT head, but I'm sure that pilots will nod at the concept of double or triple redundancy. Their lives depend on it. In my case it is just my job.

My fear always is that a single system failure might not be contained when it stems from a logical or intrinsic programming error, which, with the cold logic of object code, propagates through the redundant systems as well. It is like a fault in your primary hydraulic system that is not actually isolated, because the same problem which led to its failure also exists on the fallback.
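
To put that fear in concrete terms, here is a toy Python sketch. Everything in it (the `parse_fare` function, the host names, the record format) is made up for illustration and has nothing to do with any real reservations system. The point is only that both hosts run the same code, so a record that trips the bug on the primary trips it on the standby too, and failing over buys you nothing.

```python
def parse_fare(record: str) -> int:
    """Hypothetical business logic shared by every host.

    The deliberate bug: it assumes the second field is always a number.
    """
    return int(record.split(",")[1])


class Host:
    """A stand-in for one machine in a redundant pair (illustrative only)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def process(self, record: str) -> int:
        # Every host runs the identical code, bug included.
        return parse_fare(record)


def process_with_failover(hosts, record):
    """Try each host in turn; return (result, failures).

    result is None if every host failed on this record.
    """
    failures = []
    for host in hosts:
        try:
            return host.process(record), failures
        except Exception as exc:
            failures.append(f"{host.name}: {type(exc).__name__}")
    return None, failures


hosts = [Host("primary"), Host("standby")]

# A well-formed record is handled by the first host.
good = process_with_failover(hosts, "LHR-CDG,120")

# A malformed record hits the same bug on both hosts, so failover does not help.
bad = process_with_failover(hosts, "LHR-CDG,N/A")
```

Hardware redundancy covers the case where one host dies. It does nothing for the case where the input or the logic itself is the problem, which is why the second call fails on both hosts in exactly the same way.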