
BA delays at LHR - Computer issue

Old 27th May 2017, 17:30
  #41 (permalink)  
 
Join Date: Jun 2009
Location: Canada
Posts: 464
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by lamer
Cost of parallel backup system: €40 million just to set up.
And useless if you don't regularly test switchovers to the backup system to make sure it works. It's always fun when the online server fails, the system automatically switches to the backup... and it crashes.

As people have said, a lot of this is also probably due to having to build around legacy systems. If you started from scratch, you could build in the redundancy required and ensure it is thoroughly tested from the ground up. It's much harder to bolt that on to an old system that has to be operational 24/7.
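
A rough sketch of what "regularly test switchovers" could look like, for illustration only. The `Node` class and `switchover_drill` function are hypothetical names made up here, not anything BA or any vendor is known to run; the point is just that the backup is exercised on purpose, before it is needed.

```python
class Node:
    """Hypothetical stand-in for a server; 'healthy' simulates whether it works."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} served {request}"


def switchover_drill(backup, probe="health-check"):
    """Send a probe request to the backup on purpose.

    Returns True if the backup served it, False if it fell over.
    """
    try:
        backup.handle(probe)
        return True
    except RuntimeError:
        return False


# A backup that works, and one that has quietly rotted since it was installed.
backup_ok = switchover_drill(Node("backup"))
backup_broken = switchover_drill(Node("backup", healthy=False))
```

Run on a schedule, a drill like this finds the rotted backup on a quiet Tuesday rather than on a bank holiday weekend.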
MG23 is offline  
Old 27th May 2017, 17:36
  #42 (permalink)  
 
Join Date: May 2008
Location: uk
Age: 77
Posts: 5
Likes: 0
Received 0 Likes on 0 Posts
I was swallowed up into Big Airways nearly 30 years ago, from a certain 'Scottish airline'. Even back then I was appalled at the standard of BA's IT. It was awful, and nothing has changed; it has been awful throughout the intervening years. They have, and always have had, an arrogant 'we know best' approach to the real world. Something like this does not surprise me one little bit.
roquefort01 is offline  
Old 27th May 2017, 17:49
  #43 (permalink)  
 
Join Date: Jul 2014
Location: England
Posts: 400
Received 1 Like on 1 Post
Totally agree with PAXboy at #40 and others who have pointed out that this is likely to be, at its source, a management failure.

If your IT system is mission-critical, then you have to do what it takes to make failure a real improbability (as in 0.0001%). Bean-counters are very likely to balk at the expense until they have an expensive failure. Even then they may make a trade-off: if the expense of the failure is significantly less than the expense of a resilient system, then we can live with a failure now and then ... Having said that, even people with mission-critical systems which one would expect to have full resilience do have failures now and then (ATC comes to mind).
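
To put numbers on the two ideas above, here is a back-of-envelope sketch. It shows how little downtime a 0.0001% failure target actually allows, and the bean-counter's comparison of expected outage losses against the cost of resilience. All money figures and the outage rate are made-up placeholders, not BA data, and both function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(unavailability_percent):
    """Yearly downtime implied by a given unavailability percentage."""
    return SECONDS_PER_YEAR * unavailability_percent / 100.0


def resilience_pays_off(annual_resilience_cost, outage_cost, outages_per_year):
    """True when expected yearly outage losses exceed the yearly cost of resilience."""
    return outage_cost * outages_per_year > annual_resilience_cost


# 0.0001% unavailability is roughly half a minute of downtime per year.
budget = downtime_seconds_per_year(0.0001)

# Placeholder figures: resilience costing 8m a year, against a 100m outage
# assumed to happen once every ten years (expected loss 10m a year).
worth_it = resilience_pays_off(8_000_000, 100_000_000, 0.1)
```

With these placeholder numbers the resilient system wins, but halve the assumed outage rate and the comparison flips, which is exactly the trade-off described above.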

In theory one should be able to fall back to manual operation (if you have manual procedures available, and you've tested them, and your people have not only been trained with them but have actually used them in real life in the recent past). But it may be that we're now so dependent on interlinked IT systems that manual operation is no longer practical.
OldLurker is offline  
Old 27th May 2017, 17:57
  #44 (permalink)  
Paid...Persona Grata
 
Join Date: Aug 2004
Location: Between BHX and EMA
Age: 78
Posts: 240
Received 4 Likes on 3 Posts
Cost of parallel backup system: €40 million just to set up.
And how much is this incident going to cost? Lost revenue and compensation, to say nothing of goodwill and future sales. I reckon £40 million would probably have been cheap.
UniFoxOs is offline  
Old 27th May 2017, 17:57
  #45 (permalink)  
 
Join Date: Jan 2008
Location: Reading, UK
Posts: 15,816
Received 201 Likes on 93 Posts
Originally Posted by OldLurker
In theory one should be able to fall back to manual operation (if you have manual procedures available, and you've tested them, and your people have not only been trained with them but have actually used them in real life in the recent past). But it may be that we're now so dependent on interlinked IT systems that manual operation is no longer practical.
Yes, the notion that you can run an airline, or indeed any operation of a comparable size, with pen, paper and sticky labels is a fantasy.
DaveReidUK is online now  
Old 27th May 2017, 18:06
  #46 (permalink)  
 
Join Date: Apr 2008
Location: iom
Posts: 112
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by crewmeal
So does this mess affect City Flyer's operations in the regions, or is this just London Airways?
CityFlyer's flights appear to be operating relatively normally today.
jijpc is offline  
Old 27th May 2017, 18:13
  #47 (permalink)  
 
Join Date: Aug 2001
Location: se england
Posts: 1,579
Likes: 0
Received 48 Likes on 21 Posts
In today's world you cannot easily have 'manual reversion', and certainly not to pen and paper.

I have been in telecoms for 35-odd years, and back in the 60s AT&T/Bell Telephone realised that if direct dialling 'broke', every woman (no sexism then) in the USA would have to be a telephone operator (I realise that if you are under 40-odd you have no idea about that).

In these cases of massive complexity you have a number of strategic issues you must deal with:
1. You cannot skimp on diversity and redundancy, and they cost money.
2. You cannot keep bodging systems to work together. You will never be rid of all add-ons, but you can ensure a stable core through redundancy of computers and interconnecting telecoms networks. That costs a lot of money.
3. You have to maintain a substantial core of staff who are expert and experienced on your systems, because they are all different in some way. Outsourcing and working to an SLA for any complex activity is the road to disaster.
4. Someone at board level has to understand all this and have a lot of influence to stop the spread of 'can't you do it cheaper', prevalent in UK companies, as opposed to 'can't you do it better'.
5. You must proactively hunt down problems, as referred to earlier with Netflix. You need skilled staff to do this safely and securely, and they are not cheap.
6. If you look at Google, Facebook, Amazon etc., they have massive global networks with 'mirrors' all over the world, so there is no single point of failure. Everything on one mega server/data centre is backed up on another one in a different country or continent. This costs a lot of money, but because of their cash position and stock values they can afford it.
7. Recognise that IT for an airline is at the heart of everything. Marketing is based on your customer data, and sales are transacted through it. The passenger side of the system feeds the ops side with pax count, baggage weight and numbers, all of which affect fuel, weight and balance, etc.

So it is not just important, it is at the heart of the operation, but I bet the CFO and the head of Sales and Marketing earn more than the head of IT, and that the CEO has not got a clue about the issue.

Incidentally, where is he? Talk about leading from the bunker.
pax britanica is offline  
Old 27th May 2017, 18:29
  #48 (permalink)  
 
Join Date: Mar 2015
Location: UK
Posts: 109
Received 3 Likes on 1 Post
Psalms ....

From Psalms, chapter 7, verse 11:

Blessed are the paranoid, for they test their backups.
paperHanger is offline  
Old 27th May 2017, 18:31
  #49 (permalink)  
 
Join Date: Oct 2002
Location: London UK
Posts: 7,651
Likes: 0
Received 18 Likes on 15 Posts
Alex Cruz's crusade to minimise the costs of the operation and beat the LCCs at their game reaches yet another new level, although I have to say, looking at two bookings I have for London to Dublin and back next week, it has in no way reached down to the ticket pricing.

I wonder when Willie will finally suss him.
WHBM is offline  
Old 27th May 2017, 18:37
  #50 (permalink)  
Paxing All Over The World
 
Join Date: May 2001
Location: Hertfordshire, UK.
Age: 67
Posts: 10,146
Received 62 Likes on 50 Posts
Both pax brit and paperHanger - I concur, it is what IT have been dealing with for decades. When I was working for a UK national high street chain in the 1990s, when the Chief Financial Officer walked to his desk in the morning, he had a printout of what had happened (financially) in nearly 1,000 shops the day before. He also knew all the foreign exchange transactions that the company buyers had made. YET, he prevented the Director of IT from reporting directly to the CEO and, effectively, our Director was demoted to report to the CFO. That sent a message throughout the IT department that the Chief Exec did not understand us or value us. I saw the company make so many stupid mistakes and, eventually, waste so much money. Their shareholders never knew.

I just checked various news points and the CEO is not reported anywhere. You would have thought after the recent problems in the USA that CEOs would have learnt to get in off the golf course a little faster. OK, he might be stuck waiting for a flight in HKG but he could jump into any TV news studio.
PAXboy is offline  
Old 27th May 2017, 18:43
  #51 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by PAXboy
Ian W
There will be more of these complete machine failures when the Directors of the Board pay the money!

I was in telecommunications and IT for 27 years, including mission critical stuff for banking. After the recession of 90/91, it was all about saving money. One small example: A friend of mine who works for a small software company still gets faced by the Boss telling the staff to put their development plans on hold as he just sold some new feature to a customer. A feature that had been dropped from the development plan for a good reason. That is within the last month.

On this scale of events: as has been said above, there are too many systems, many of them legacy, that do not dovetail well together. Irrespective of the outsourcing problem (and it is a problem), this level of complexity, to provide ever more features and services, will fail.
I think we are both right.
It is cheaper to use IT 'professionals' who have only ever seen Java and its variants and they are then forced to try to work with legacy systems written in one of the vogue languages from the 80's or in assembler with FORTRAN maths libraries. There will be one or two 'skilled' programmers who will dig in and use clever features of compilers to write glue code (in stiffwareTM) to hold these systems together. These programmers will then move on leaving no documentation behind them and nobody will know how these kludged together systems actually work or how to fix them if some new added feature causes them to fail.
This is all due to a lack of quality control and pressure from unthinking management on IT staff that do not have the authority to tell the management 'NO it will not work'.
Ian W is offline  
Old 27th May 2017, 18:44
  #52 (permalink)  
 
Join Date: Jun 2001
Location: Rockytop, Tennessee, USA
Posts: 5,898
Likes: 0
Received 1 Like on 1 Post
Originally Posted by PAXboy
I just checked various news points and the CEO is not reported anywhere. You would have thought after the recent problems in the USA that CEO's would have learnt to get in off the golf course a little faster. OK, he might be stuck waiting for a flight in HKG but he could jump into any TV news studio.
He did make an appearance on Twitter:

https://twitter.com/British_Airways/...20211976212480
Airbubba is offline  
Old 27th May 2017, 18:47
  #53 (permalink)  
 
Join Date: May 2011
Location: Hampshire
Age: 76
Posts: 821
Likes: 0
Received 0 Likes on 0 Posts
The CEO has spoken. He says it is all down to a power supply failure.
I was surprised he had a degree in engineering. I had supposed his degree may have been in kidology!
If he is not out the door by Tuesday, then Willie Walsh should be considering his own position.
KelvinD is offline  
Old 27th May 2017, 18:50
  #54 (permalink)  
 
Join Date: Jan 2012
Location: Below transition level
Posts: 364
Received 2 Likes on 2 Posts
Originally Posted by DaveReidUK
Yes, the notion that you can run an airline, or indeed any operation of a comparable size, with pen, paper and sticky labels is a fantasy.
Indeed, all the clowns suggesting you can run a modern airline with paper, pens and post-its should consider the computational complexity contained within the Boeing or Airbus fleet that is actually moving the pax around the globe.

This is a management failure, nothing more, nothing less.
Fostex is offline  
Old 27th May 2017, 18:55
  #55 (permalink)  
 
Join Date: Aug 2007
Location: Ireland
Posts: 216
Likes: 0
Received 0 Likes on 0 Posts
Computer-based systems will go down. Anyone who tells you differently is lying. How fast they come back depends on how much you have invested in parallel systems, in testing them, and/or in the availability of critical human support and spares.
Those who demand maximum-downtime guarantees in the hundredths of a percent are just asking to be led up the garden path. Every airline should therefore have a manual minimum-service alternative, or at most one that depends only on local computing power, be that a standalone tablet, PC or mobile. For example: with passenger lists for the day stored locally at the start of each day, only last-minute changers would be denied boarding, reducing the number of customers displaced. And this goes for flight ops as well as ground ops, now that so much has to be communicated electronically the same day to agencies like Eurocontrol.
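
The locally stored passenger list idea can be sketched roughly as below. This is an illustration only: the file format, field names (`booking_ref`, `flight`), the sample booking and both function names are assumptions invented for the example, not how any real departure control system works.

```python
import json
import tempfile
from pathlib import Path


def snapshot_manifest(passengers, path):
    """Write the day's passenger list to local storage (run once at start of day)."""
    Path(path).write_text(json.dumps(passengers))


def can_board_offline(path, booking_ref, flight):
    """Check a booking against the local snapshot only; no network is needed."""
    manifest = json.loads(Path(path).read_text())
    return any(
        p["booking_ref"] == booking_ref and p["flight"] == flight
        for p in manifest
    )


# Demo in a temporary directory, with made-up booking data.
with tempfile.TemporaryDirectory() as tmp:
    local_copy = Path(tmp) / "manifest.json"
    snapshot_manifest([{"booking_ref": "ABC123", "flight": "XX100"}], local_copy)
    known = can_board_offline(local_copy, "ABC123", "XX100")        # in the snapshot
    late_change = can_board_offline(local_copy, "ZZZ999", "XX100")  # booked after it
```

Anyone in the morning snapshot can still be boarded with the central system down; only bookings made or changed after the snapshot fall through, which is the limited displacement described above.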

BA (and Amadeus) seem not to have been willing to make a clean break from their "mainframe"-based systems, maybe because that would have led to guaranteed downtime. This has led to a patchwork that is not set to diminish over time. Sometimes it's better to make a clean break with the past. And even Amadeus now has alternatives. The problem is finding somebody with the stomach and power to schedule, and the willingness to take the flak for, the temporary disturbances it is guaranteed to lead to. And when the worst happens, at some stage it will take too long to recover from the mess of today. So better to reposition, clean out and create a fresh start for tomorrow. Sad for today's customers, but at least tomorrow's won't be affected.
vikingivesterled is offline  
Old 27th May 2017, 18:56
  #56 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by KelvinD
The CEO has spoken. He says it is all down to a power supply failure.
I was surprised he had a degree in engineering. I had supposed his degree may have been in kidology!
If he is not out the door by Tuesday, then Willie Walsh should be considering his own position.
I shall translate:

"He says it is all down to a power supply failure."

really means:

We designed the system so that one power supply failing would bring down our entire operations worldwide.

Therefore
  1. Primary reason is poor hardware system design
  2. Secondary reason: lack of or poor system acceptance testing

In this BA seems to be following the example of Delta, whose backup system was powered through the same cutover switch as the primary system. So the cutover switch became the single point of failure, and it duly failed.

Rule one of fault tolerance: Single points of failure ALWAYS fail.
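
The arithmetic behind that rule, as a rough sketch. It assumes independent failures and uses a made-up 99% availability for every component, so the numbers are illustrative and not measurements of any real installation; the `series` and `parallel` helpers are hypothetical names.

```python
def parallel(*availabilities):
    """Redundant components: the group is up if at least one member is up."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


def series(*availabilities):
    """Chained components: the group is up only if every member is up."""
    p_all_up = 1.0
    for a in availabilities:
        p_all_up *= a
    return p_all_up


site = 0.99     # assumed availability of primary and of backup
switch = 0.99   # assumed availability of the shared cutover switch

# Primary and backup with nothing in common.
independent = parallel(site, site)

# Primary and backup both fed through one cutover switch.
shared = series(switch, parallel(site, site))
```

However good the two sites are, `shared` can never exceed the availability of the switch itself, so the money spent on the backup buys almost nothing until the common element is duplicated as well.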
Ian W is offline  
Old 27th May 2017, 18:57
  #57 (permalink)  
Paxing All Over The World
 
Join Date: May 2001
Location: Hertfordshire, UK.
Age: 67
Posts: 10,146
Received 62 Likes on 50 Posts
It does not matter where the problem started - the problem was that it ran out of control. Management have only themselves to blame. As I said earlier, I'm sure that a lot of current (and ex) IT folks from BA have warned about this.
PAXboy is offline  
Old 27th May 2017, 18:59
  #58 (permalink)  
 
Join Date: Jun 2002
Location: Back on The Island.
Posts: 480
Received 0 Likes on 0 Posts
Fostex... music to my ears, it's about time 'managers' were exposed for what they really are. I'm glad that I'm out of all that yuk-speak.
zed3 is online now  
Old 27th May 2017, 19:00
  #59 (permalink)  
 
Join Date: Aug 2001
Location: se england
Posts: 1,579
Likes: 0
Received 48 Likes on 21 Posts
Ian W
I absolutely agree with you. These systems are very, very complex and you need experienced, skilled people to manage them. Sure, you can outsource maintenance or terminals and straightforward stuff, but the real airline-specific and core bits need a lot of ability and corporate memory to manage. Lose those folks and you are in big trouble.
As to it being a power supply issue, that is unforgivable, since any mission-critical equipment or equipment location on a system of this scale should have at least two wholly discrete 'mains' feeds, plus a no-break system to allow for local diesel generator backup, and maybe even batteries for the telecoms elements. No high-tech hidden glitches there, but they do have to be tested and checked all the time to make sure that if the evil day comes...
pax britanica is offline  
Old 27th May 2017, 19:04
  #60 (permalink)  
 
Join Date: Aug 2001
Location: se england
Posts: 1,579
Likes: 0
Received 48 Likes on 21 Posts
I think the CEO should have appeared in person if he is in the UK; it is a bit of a bad joke using Twitter to apologise for something like this. And I agree with Dave Reid - he shouldn't be CEO after the weekend.
pax britanica is offline  


