PPRuNe Forums - View Single Post - BA delays at LHR - Computer issue
6th Jun 2017, 10:44
#556
Ian W
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Originally Posted by scr1
From that article:
However, an email leaked to the media last week suggested that a contractor doing maintenance work inadvertently switched off the power supply. The email said: "This resulted in the total immediate loss of power to the facility, bypassing the backup generators and batteries... After a few minutes of this shutdown, it was turned back on in an unplanned and uncontrolled fashion, which created physical damage to the systems and significantly exacerbated the problem."
This is an admission that the BA system was not designed to be fault tolerant. It should not be possible to bring down a distributed fault-tolerant system by failing one data center, however untidily. Similarly, an untidy restart that causes further failures in an already 'failed' data center should, by definition, be completely transparent to users; it should only extend the time it takes to bring that data center back up.

I can remember walking around during acceptance testing of a system that _was_ fault tolerant, randomly failing servers, disk drives, boards within servers, power supplies etc., and the system just kept going as it was designed to. The BA system was obviously not designed to be fault tolerant. Or it had been put into a state where it was not fault tolerant by people who did not know what they were doing.
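
As a rough illustration of that kind of acceptance test, here is a minimal Python sketch. It is a toy model only and says nothing about how BA's systems are actually built: the names `DataCenter`, `FaultTolerantService` and `acceptance_test` are hypothetical, and a "site" is reduced to a single up/down flag. The point it shows is simply that with more than one independent site, pulling the plug on any one of them should never be visible to the user.

```python
import random


class DataCenter:
    """Hypothetical stand-in for one site; only 'up' or 'down' is modelled."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{request} served by {self.name}"


class FaultTolerantService:
    """Routes each request to the first site that is still alive."""

    def __init__(self, sites):
        self.sites = sites

    def handle(self, request):
        for site in self.sites:
            try:
                return site.handle(request)
            except ConnectionError:
                continue  # fail over to the next site
        raise RuntimeError("all sites down")


def acceptance_test(service, rounds, rng):
    """Fail one random site per round and check requests are still served."""
    served = 0
    for i in range(rounds):
        victim = rng.choice(service.sites)
        victim.up = False            # pull the plug on one site
        service.handle(f"req-{i}")   # must not raise while another site is up
        served += 1
        victim.up = True             # the failed site is brought back later
    return served
```

Something like `acceptance_test(FaultTolerantService([DataCenter("A"), DataCenter("B")]), 50, random.Random(0))` should complete all 50 rounds. A design where losing one site raises an error to the caller fails this test by construction, which is the situation the leaked email appears to describe.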