
PPRuNe Forums (https://www.pprune.org/)
-   Passengers & SLF (Self Loading Freight) (https://www.pprune.org/passengers-slf-self-loading-freight-61/)
-   -   BA delays at LHR - Computer issue (https://www.pprune.org/passengers-slf-self-loading-freight/595169-ba-delays-lhr-computer-issue.html)

Caribbean Boy 30th May 2017 22:15


Originally Posted by Tay Cough (Post 9787706)
Wasn't Comet House pulled down years ago?

Yes, and Speedbird House (next to TBA if memory serves me right) in 1998.

Tay Cough 30th May 2017 22:27

Maybe that's the problem. :rolleyes:

PanzerJohn 30th May 2017 22:53

I'm not sure if this is of any use: https://books.google.co.uk/books?id=...irways&f=false

G-CPTN 30th May 2017 23:15


Originally Posted by Ian W (Post 9787673)
In a machine room from my past, that no longer exists, a security guard groped for the light switch and powered down the research data center using the emergency switch despite its switch cover - rather obviously the lights did not come on although several smaller ones went off :eek:. So the guard groped further and found the light switch and realized that the wrong switch had been used so switched it back on again and went on his way not realizing the chaos caused by the few minutes interruption and uncontrolled restoration. :D

This sounds like a similar occurrence.

Way back in the mid-1980s, I was working for a small engineering manufacturing company that ran its daily systems on Apple II computers using VisiCalc (which took 20 minutes to load and a further 20 minutes to save work).
There was no UPS, just straight off the mains.

An electrician had been called in to do some work on lighting and he systematically pulled the fuses to see which circuit the faulty lighting was on.

When he was challenged about his action, which had 'crashed' all the computers, he replied that he had immediately replaced the fuse "so it couldn't have had any effect".

He obviously had no comprehension of how computers worked (and didn't work).

Each computer operator lost all the work that they had input in that session.

Self Loading Freight 31st May 2017 00:33

So power failed at a UPS - wrong switch hit, or whatever. Which happens, and UPSs fail anyway.

Thereafter, though: I know that bringing servers up all at once can lead to mayhem, but 'serious physical damage'? It also sounds very much as if failover to the other site was badly broken - obviously it was - but it's hard to make sense of the D Tel's narrative around that point.
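
Purely as an illustration of the staged bring-up idea (the host names, batch size and settle time below are invented for the sketch - nothing to do with BA's real estate or tooling), something along these lines is the usual way to avoid the thundering herd:

    # Illustrative sketch only: restart dependency groups a few hosts at a time,
    # pausing between batches, rather than powering everything on at once.
    import time

    # Hypothetical restart order - databases first, then middleware, then front ends.
    RESTART_GROUPS = [
        ["db-01", "db-02"],
        ["mq-01", "mq-02", "mq-03"],
        ["web-01", "web-02", "web-03", "web-04"],
    ]

    BATCH_SIZE = 2        # hosts powered on concurrently
    SETTLE_SECONDS = 300  # let health checks pass and caches warm before the next batch

    def start_host(name: str) -> None:
        """Placeholder for whatever actually boots a host (remote power control, orchestration tool...)."""
        print(f"starting {name}")

    def staggered_restart() -> None:
        for group in RESTART_GROUPS:
            for i in range(0, len(group), BATCH_SIZE):
                for host in group[i:i + BATCH_SIZE]:
                    start_host(host)
                time.sleep(SETTLE_SECONDS)  # limits inrush load and reconnect storms

    if __name__ == "__main__":
        staggered_restart()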

I guess the main press attention will move on, and BA will be hoping that nobody will look too closely at whatever details are made public outside the company. But if the truth isn't a toxic mix of bad management, bad design, bad testing and bad ops, I'll eat a ZX Spectrum.

p.j.m 31st May 2017 03:42


Originally Posted by Self Loading Freight (Post 9787829)
I know that bringing servers up all at once can lead to mayhem, but 'serious physical damage'?

When really old servers are physically powered off, there is a good chance that they will not start up again; capacitors etc. which have just been "hanging on" can't handle the power-off/power-on surge and fail.


a ZX Spectrum.
- maybe that's what they were running :)

Freehills 31st May 2017 03:49


Originally Posted by ACW599 (Post 9787574)
>If this is the calibre of senior management at BA then they deserve to fail.<

Does anyone nowadays doubt that the Peter Principle has been proved beyond reasonable doubt to be true? I can't off-hand think of any large company that has enlightened and capable management -- I certainly never came across any in my consulting days -- and it isn't necessary to look very closely for it to be horribly obvious that none of the emperors have any clothes.

Perhaps an even bigger mystery is why society at large allows the situation to persist.

Pretty much by definition, anyone who ends up running a large company is an ambitious sociopath, because if you are not, you will not get there. Two of the largest firms in the world (Amazon and Apple) were built by deeply unpleasant people (Bezos & Jobs). Society at large lets it persist because "they get things done" and, anyway, why would anyone normal want the role?

Sunfish 31st May 2017 04:40

In my previous life, I was at one point Group General Manager: System Integration, for an outsourcing service provider and I have a few things to contribute to this discussion.

The first is that, if efficient, flexible, custom-built IT systems are central to your competitive advantage, then the IT manager needs to be a direct report to the CEO, not to some bean counter one level down. This is because your IT capabilities will limit your available business strategies unless there is constant change to your IT strategy to match the business needs. That means that the CEO and chief IT manager have to work very closely. For BA - for any airline, in fact - IT systems are central to competitive advantage. This is unarguable.

As an outsourcer, we always do our best to "lobotomise" our customers - remove what IT expertise they have in order to make them reliant on us. We do this by hiring their best and brightest IT staff and arranging for the firing of the rest whom we don't want or who represent a threat to our dominance of the IT agenda.

Once we have lobotomised the client, since we have access to the business strategy (the IT systems must match the strategy, right?), we tailor our offerings to direct the business strategy down avenues that maximise our profits. This includes change for change's sake that generates the need for additional services such as business process re-engineering, change management, training services, etc.

We also cut off threats to our dominance of the IT agenda by killing changes we don't like. We use politics, fear and cost tactics to do this.

Usually after about five years of this, the client realises they have been lobotomised, are spending big money on IT and have their strategic directions "boxed in" by the outsourcer - meaning any changes are going to cost millions. They then realise they need to bring high-level IT talent back in house to regain control of the IT agenda and pry our hands, finger by finger, from the steering wheel of the company. Naturally we resist this any and every way we can. Please note, we are not good parasites; we don't care if we kill our host when he rejects us.

It is now quite clear that the Board of IAG and the management of BA failed: at Board level, by not understanding the risks they were running and not insisting that IT be considered a top management responsibility; at management level, by not understanding the pitfalls of outsourcing and not effectively managing their service providers, who will have been doing to BA management precisely what I spelled out above. Let me be clear. This is Board and senior management failure on a simply unpardonable level.

Since my view of the quality of British "management" is unprintable, let me speculate on what will happen next: as some folk here have stated, the innocent will be punished, the guilty will be promoted and management will go on its merry way, having learned precisely nothing.

On a technical note, the concept of an "unforeseen equipment failure" as the root cause of the problem is risible - bollocks and BS.

By definition, a backup system, or systems, properly designed and built, is there for the specific purpose of handling said "unforeseen equipment failure" - at the worst possible time, under the worst possible conditions of staffing levels, weather, etc. - or it isn't a backup system at all; it is merely some spare machines that might be able to take some of the load if suitably coddled.

To put that another way, the poster, an IT professional who described his method of testing a backup system by randomly pulling plugs and flicking switches knew exactly what he was talking about.

underfire 31st May 2017 05:35

In reality, systems and backup systems are designed to protect against reduced power (brownouts), loss of power (blackouts) and power spikes such as lightning strikes... BUT power surges, which may last several minutes and overload the primary and, depending on duration, the backup systems, are obviously a different issue.

While a loss or reduction of power is protected against within backup systems, sustained power surges are not typically protected against, as redirecting that power is difficult.

While the BA issue was the newsworthy one, it will be interesting to see what other systems were affected, if any.

crewmeal 31st May 2017 05:49

Doubts now circulating regarding the 'truth' about the weekend's fiasco. Was it a power outage? What's BA covering up?

Doubts raised over BA's power problem claims - ITV News

Heathrow Harry 31st May 2017 06:06

"An electrician had been called in to do some work on lighting and he systematically pulled the fuses to see which circuit the faulty lighting was on"

There is/was a persistent rumour around Aberdeen that back in the 80s an engineer pressed the large red Emergency Shutdown button on the Forties platform offshore to "check that it worked" - it did. It took over 3 days - and the loss of 300,000 barrels of production, say $60 million - to get it back on stream.

DaveReidUK 31st May 2017 06:25


Originally Posted by underfire (Post 9787938)
In reality, systems and backup systems are designed to protect against reduced power (brownouts), loss of power (blackouts) and power spikes such as lightning strikes... BUT power surges, which may last several minutes and overload the primary and, depending on duration, the backup systems, are obviously a different issue.

As previously posted, the electricity supply companies have refuted BA's claim that there was a surge in the power supply.

MYvol 31st May 2017 07:04

Big buttons are easy to press.
 

Originally Posted by Ian W (Post 9787673)
In a machine room from my past, that no longer exists, a security guard groped for the light switch and powered down the research data center using the emergency switch despite its switch cover - rather obviously the lights did not come on although several smaller ones went off :eek:. So the guard groped further and found the light switch and realized that the wrong switch had been used so switched it back on again and went on his way not realizing the chaos caused by the few minutes interruption and uncontrolled restoration. :D

This sounds like a similar occurrence.

I was thinking that it's not that easy to shut down a redundant UPS; then I read your post and recalled constantly having to stop visitors pressing the large (green?) button they thought would release the door lock. Eventually the buttons were covered with a flap, which helped a bit.

dsc810 31st May 2017 07:54


Originally Posted by DaveReidUK (Post 9787975)
As previously posted, the electricity supply companies have refuted BA's claim that there was a surge in the power supply.

...and anyway, even my UPS unit from APC will protect against both under- and over-volts as well as surges.
Most recently an 11 kV fault took out 2 phases and left one (mine) on.
This resulted in all sorts of power spikes and gyrations, with the voltage (UK) on my line rising to 283 volts briefly before it settled down at 277.
My UPS simply cut in and took over, regulating the voltage down to normal, and continued to do so for the next hour.
Then it took over completely when SSE cut the 11 kV feed entirely and instead installed a lorry-mounted generator at the local substation while they took two days to sort out the damage.

Mine is a simple "offline" type UPS which monitors the incoming line conditions and switches in and out accordingly.
A commercial-grade "online" type UPS would be far better at smoothing out disturbances.
Online-type UPSs sit continuously between the incoming supply and the items protected, so the feed actually comes from the UPS at all times.
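
To make the distinction concrete, here is a toy sketch (the 230 V nominal and the 15 V switchover window are assumed numbers for illustration, not APC's actual thresholds):

    # Toy model of the two UPS types described above (illustrative numbers only).
    NOMINAL_V = 230.0
    TOLERANCE_V = 15.0  # assumed window outside which an offline/standby UPS cuts in

    def offline_ups_output(mains_v: float) -> float:
        """Offline/standby UPS: passes mains through, only switching to the inverter
        when the line drifts out of tolerance or fails."""
        if mains_v == 0.0 or abs(mains_v - NOMINAL_V) > TOLERANCE_V:
            return NOMINAL_V  # inverter takes over from the battery
        return mains_v        # the load sees whatever the mains is doing

    def online_ups_output(mains_v: float) -> float:
        """Online (double-conversion) UPS: the load is always fed from the inverter,
        so line disturbances never reach it directly."""
        return NOMINAL_V

    for v in (230.0, 277.0, 283.0, 0.0):
        print(f"mains={v:5.1f}V  offline={offline_ups_output(v):5.1f}V  online={online_ups_output(v):5.1f}V")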

Epsomdog 31st May 2017 09:06


Originally Posted by Sunfish (Post 9787921)
... Let me be clear. This is Board and senior management failure on a simply unpardonable level.

Since my view of the quality of British "management" is unprintable, let me speculate on what will happen next: as some folk here have stated, the innocent will be punished, the guilty will be promoted and management will go on its merry way, having learned precisely nothing.

Reminds me of an old RAF phrase, "Promoted Beyond the Level of Incompetence" - a term that fits a lot of managers nowadays.

Caribbean Boy 31st May 2017 09:55

In the Peter Principle, employees are promoted to their level of incompetence, and remain in that position.

Gordomac 31st May 2017 10:16

Never liked computers and told everyone that this would, one day, happen. Told you so. Oh & when I was first told to look at Pprune as people were talking about me, I googled "Pee prunes". The stuff that hit me through the junk mail nearly cost me my marriage. Oh & don't get me started about the high-tech Airbus !

Sunfish 31st May 2017 10:26

Underfire:


In reality, systems and backup systems are designed to protect against reduced power (brownouts), loss of power (blackouts) and power spikes such as lightning strikes... BUT power surges, which may last several minutes and overload the primary and, depending on duration, the backup systems, are obviously a different issue.

While a loss or reduction of power is protected against within backup systems, sustained power surges are not typically protected against, as redirecting that power is difficult.

While the BA issue was the newsworthy one, it will be interesting to see what other systems were affected, if any.

Let's get one thing clear. The BA systems are what's called "mission critical" - meaning they are expected to be designed to work no matter what the failure, short of a nuclear attack.

Hence talk of brownouts, power surges, leaking/failing batteries, momentary interruptions, etc. as possible unforeseen causes is irrelevant. A mission-critical backup system is supposed to work no matter what the cause of the failure of the primary system, no ifs, no buts.

To put that another way, a truck running into a power pole, a ham-handed excavator driver, an intern hitting the wrong button, ice on the points, floods, fire, etc. are not sufficient excuse for the failure of a mission-critical backup system; in fact, those events are some of the particular conditions to be designed for.

There is no excuse for the failure of the BA IT systems except poor design and execution.

...But what does one expect from a country that is continually surprised every winter when there is a small snowfall?

Mac the Knife 31st May 2017 10:39

I have some relatively small-scale experience in such matters, and no "power surge" short of a massive nuclear EMP is going to take out my systems.

All this hand-waving about "power surges", as if that were some kind of excuse, is a straight-out lie (even if it were true, which we know was another lie). A big, essential system like BA's should properly be architected in such a way that NOTHING will take it right down for days on end.

Degraded for a few hours has to be the worst-case scenario.

There is simply no excuse at all that would cut it with any competent systems manager - ergo, they had no competent systems manager either in house or out.

And for any big company to outsource their IT management is very very foolish. One largish company that I am familiar with has someone reasonably senior in their server room 24/365 and at least two other branches with spare capacity to take over within minutes.

Incompetence, lies, injudicious outsourcing and penny-pinching on one of their core resources.

Plain stupid.

pax britanica 31st May 2017 11:19

In any large telecoms company facility or data centre operation, the UPS is not just a simple backup system. It does what it says: it is an uninterruptible power supply, and it sits between things like diverse mains inlets and on-site generators. It manages the switch from one to the other if required and, very importantly, smooths out or buffers any spikes, which are common on any electrical system under load if it is suddenly interrupted or switched. So that's one per BT exchange, and goodness knows what for a BT network ops centre or even a modest data centre.
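
Very roughly, the source-selection side of that looks like the sketch below (purely illustrative - the source names and priority order are assumptions, not any particular vendor's transfer logic), with the battery bridging the load while sources change over:

    # Illustrative sketch of transfer-switch-style source selection (not real kit).
    from typing import Dict, Optional

    # Assumed priority: either diverse mains inlet first, then the standby generator.
    SOURCE_PRIORITY = ["mains_A", "mains_B", "generator"]

    def select_source(available: Dict[str, bool]) -> Optional[str]:
        """Return the highest-priority healthy source, or None (battery carries the load)."""
        for source in SOURCE_PRIORITY:
            if available.get(source, False):
                return source
        return None

    # Both mains inlets lost and the generator still spinning up -> battery only.
    print(select_source({"mains_A": False, "mains_B": False, "generator": False}))  # None
    # Generator online -> transfer to it; the UPS buffers the disturbance during the switch.
    print(select_source({"mains_A": False, "mains_B": False, "generator": True}))   # generator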

Having worked in that industry, power problems are not unknown, but you have to protect against them in every possible way at critical facilities. That said, we had management who would come up with idiotic comments like "it's not going to produce any more revenue - can you delay replacing it for 2 years?" - and two years later a different genius asks the same question.

This is an almighty cock-up on BA's part, and there are huge incentives for the management to be evasive about what happened, especially as there won't be any independent investigators.

