
All London airspace closed

ATC Issues A place where pilots may enter the 'lions den' that is Air Traffic Control in complete safety and find out the answers to all those obscure topics which you always wanted to know the answer to but were afraid to ask.

Old 15th Dec 2014, 07:11
  #101 (permalink)  
 
Join Date: Mar 2008
Location: London
Age: 69
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
CAA announce independent enquiry :

The UK Civil Aviation Authority (CAA) and NATS have agreed to the establishment of an independent inquiry following the disruption caused by the failure in air traffic management systems on the afternoon of Friday 12th December 2014.

The CAA will, in consultation with NATS, appoint an independent chair of the panel which will consist of NATS technical experts, a board member from the CAA and independent experts on information technology, air traffic management and operational resilience. The full terms of reference will be published following consultation with interested parties including airlines and consumer groups but it is expected that the review will cover, as a minimum:

1. The root causes of the incident on Friday
2. NATS’ handling of that incident to minimise disruption without compromising safety
3. Whether the lessons identified in the review of the disruption in December 2013 have been fully embedded and were effective in this most recent incident
4. A review of the levels of resilience and service that should be expected across the air traffic network taking into account relevant international benchmarks
5. Further measures to avoid technology or process failures in this critical national infrastructure and reduce the impact of any unavoidable disruption

For more information, please contact the CAA Press Office, on [email protected], or 020 7453 6030 out of hours 07789745636
Independent inquiry into air traffic control failure announced | CAA Newsroom | About the CAA
118.70 is offline  
Old 15th Dec 2014, 08:39
  #102 (permalink)  
 
Join Date: Oct 2002
Location: London UK
Posts: 7,651
Likes: 0
Received 18 Likes on 15 Posts
Originally Posted by Downwind Lander
What is needed is a serious level IT expert to be ready to comment on technical explanations as they come through. Any offers?
Not quite up to such a grandiose introduction, but nevertheless ......

Everyone seems to blame it on "the computer" but there is no really understandable technical description being offered, so it is difficult to comment. And "it's old" is certainly not a technical description - what is needed for such is an explanation of why it has worked satisfactorily for so long before encountering an issue, and what caused the issue to manifest itself now.

But that's tech. As I understand it there was an outage for an hour or so. Aviation is of course well used to hour-long holdups for a wide variety of reasons. What REALLY needs to be investigated is why it then took so long for normality to be restored - there were still significant BA cancellations the following day.

This is something which increasingly afflicts not only aviation but also other transport modes like rail or road, the length of time taken to recover the service from an incident going ever upwards. It seems that NATS have been on a substantial staff reduction exercise; it is moments like this when you find out that those staff were actually doing something. Likewise for the airlines, the inability by some (not all) to have the resilience to come back from the various situations is one for them, not something to be just stuck on the ATC provider. The ability to blame it on "knock-on effect" is a glorious excuse for slowness and inertia rather than trying really hard to get things back straight again quickly. And that's nothing to do with computers.

Most notable of all is all the calls in the press for "investment" in replacement computers. Goodness me, the IT salesmen must be smacking their lips at this early Christmas present, and some little placings by their PR teams with the media contacts whilst this is Hot News doubtless works wonders as well. Time and again the high-level know-nothings get themselves talked into spending money on new kit rather than dealing with the operational procedures and management which are the real issue. It's just like airport security. 10 security stations provided of which only 3 are staffed even at peak times, and a 30-minute queue. After many complaints, what's the solution ? More security stations, of course.
WHBM is offline  
Old 15th Dec 2014, 10:25
  #103 (permalink)  
 
Join Date: Mar 2008
Location: London
Age: 69
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
Doesn't "The Register" article imply that there was a combination of circumstances that prevented the usual responses to the failure of the flight data processing system (holding the database of filed flight plans) getting linked back to the central flight server (holding the radar data) within a critically short period? It sounds as though the failures of the flight data system are by no means uncommon but NATS is well rehearsed in getting it back on the road. Unfortunately a separate problem with the link resulted in busting the deadline to prevent the radar system complaining it was only holding stale data and forcing procedures with lower capacity to start.
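
To make the "stale data deadline" idea concrete, here is a minimal sketch of a freshness check of the kind described above. It is an illustration only: the class and function names, the mode labels and the 60-second threshold are all assumptions, and nothing here is taken from the actual NATS systems.

```python
from dataclasses import dataclass

# Hypothetical deadline; the real figure is not given in this thread.
STALE_AFTER_SECONDS = 60.0


@dataclass
class FlightDataLink:
    """Tracks when flight data last arrived over the link (hypothetical)."""

    last_update: float  # seconds, taken from any monotonic clock

    def record_update(self, now: float) -> None:
        # Called whenever fresh flight data is received.
        self.last_update = now

    def is_stale(self, now: float) -> bool:
        # Data is stale once the deadline has passed without an update.
        return (now - self.last_update) > STALE_AFTER_SECONDS


def operating_mode(link: FlightDataLink, now: float) -> str:
    """Return 'normal' while data is fresh, else 'reduced-capacity'."""
    return "reduced-capacity" if link.is_stale(now) else "normal"
```

The point of the sketch is only that a recovery which would normally finish inside the deadline goes unnoticed, while the same recovery delayed by a second fault tips the system into the lower-capacity procedures.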

I wonder which system has the "delinquent" line of code ?
118.70 is offline  
Old 15th Dec 2014, 17:54
  #104 (permalink)  
 
Join Date: Mar 2008
Location: London
Age: 69
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
A previous failure of the link between the National Airspace System and the National Flight Data Processing System in 2008 seems to have been reported in Computer Weekly :

Failure of Swanwick comms link leads to flight delays
118.70 is offline  
Old 15th Dec 2014, 19:16
  #105 (permalink)  
 
Join Date: Mar 2001
Location: etha
Posts: 300
Likes: 0
Received 0 Likes on 0 Posts
In any exceptional event in the UK which affects flow control, the knock-on effect normally does take days to recover. When Heathrow loses both runways for 15 minutes and then is single runway for a further hour then the delays can be felt up to 3 days after. Exactly this occurred following the emergency return of the BA flight to Oslo after it lost the engine cowlings on departure. It wouldn't have made any difference if Heathrow had had any more runways in operation at the time; landings and departures were stopped purely for SAFETY. That is exactly what happened at Swanwick: measures were taken to provide ultimate safety, and then as the system recovered, the traffic was gradually increased again.

Who is at fault? Anyone who has an open mind will see exactly what the main UK ATC union has to say on the subject here on the Prospect website
zonoma is offline  
Old 15th Dec 2014, 19:39
  #106 (permalink)  
 
Join Date: Mar 2001
Location: etha
Posts: 300
Likes: 0
Received 0 Likes on 0 Posts
Also meant to mention that with the Belgians being on strike today, 600 flights have been cancelled.

Not one thread seen yet.

Just sayin........
zonoma is offline  
Old 15th Dec 2014, 22:22
  #107 (permalink)  
 
Join Date: Mar 2007
Location: In my head
Posts: 694
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by DelPrado
Slip and Turn, you have proven time and again on ATC threads that you don't know what you're talking about.
Well you might forgive me if I say I have reason to summarise slightly differently
I recall in the past you trying to argue separation standards with a Heathrow tower controller.
Could be true - does that mean that a mere mortal managed to engage with the next best thing to a God? Wow. I remember arguing that the vertical separation between departing London City traffic and descending Heathrow inbound crossing traffic in the area of Canary Wharf should be increased to allow much greater margin for a proven pilot error (level bust) hotspot. Simple as that, but of course it was already perfectly well reasoned, like everything else at NATS. I believe the levels were adjusted eventually (3000' minimum increased to 4000' for LHR inbounds joining westerly final from the north descending and crossing abeam western end of City?), but I have to say I haven't checked closely, so you might have a point.
Originally Posted by Gonzo
s&t,

Any shortfall in the pension, and there has been such over the past few years, has been met by increased employer contributions and changes in staff T&Cs.
So NATS, being the employer, shouldered the shortfalls? And NATS is 49% publicly owned? A bit like Lloyds Banking Group and RBS? Or nothing like Lloyds Bank and RBS ? But suffice to say that money used for plugging multi-million pound holes in gilt-edged pension commitments first made when NATS was part of the Civil Service, means a shortage in available funds for ongoing capital investment in operations, does it not? Can we agree on that? And the size of those pension fund shortfalls each year was? The accounts and reports for NATS and its various guises defy clarity of any sort at first glance, which one can only assume is deliberate. Can you decode for us and summarise, or do I have to Google more obscure documents like this one? And maybe DelPrado's 2012 re-sounding of a substantial 8 figure warning bell first rung 13 years ago in this thread: http://www.pprune.org/atc-issues/501...-reminder.html which never got heeded?

Not exactly the best transparency for a publicly owned entity, is it? Bits here and there and no-one really volunteering the big picture ... ?

A paragraph written 4½ years ago in that Government Actuary's Dept. document said this:
Originally Posted by (elsewhere) in a GAD document for CAA
The NATS scheme’s benefits are more generous than those provided by typical UK private sector DB schemes. Approximate calculations suggest that, if the NATS scheme’s benefits were to be more typical, the employer’s standard contribution rate could be around 25% of pay, compared to the actual rate of 37% of pay. The purpose of this calculation is solely to illustrate the broad effect of the level of the NATS scheme’s benefits on NERL’s projected contributions. We have not been asked to comment on the reasonableness of the level of the scheme’s benefits. We recognise that the NATS scheme’s benefits reflect the scheme’s public sector origins and protections put in place at privatisation.
The executive summary in that report also has a chart indicating that NATS En Route Plc's (NERL’s) projected pension contributions – £ million in constant 2008-09 prices terms (whatever that last caveat really means) were as high as £90M. We are 4½ years further on. How did they turn out, please? Or is the expenditure on pensions insignificant compared to capital expenditure on operations, and therefore a red herring in this thread?
And is the NERL pension expenditure the lion's share of NATS overall pension commitments or is there more buried elsewhere in the various books?

Last edited by slip and turn; 15th Dec 2014 at 22:40.
slip and turn is offline  
Old 15th Dec 2014, 23:22
  #108 (permalink)  
 
Join Date: Jan 2008
Location: The foot of Mt. Belzoni.
Posts: 2,001
Likes: 0
Received 0 Likes on 0 Posts
Slip and WHBM,
Last Friday, the folks wearing headsets and their immediate operational managers gave it their best shot.
Something happened that wasn't supposed to happen, and as a result, no-one died.
That's what it's all about.
ZOOKER is offline  
Old 16th Dec 2014, 04:09
  #109 (permalink)  
 
Join Date: Jan 2004
Location: London
Posts: 654
Received 9 Likes on 5 Posts
Folks, this whole stupid thread has been a trolling expedition, with a troublemaker knowing how to tweak sensibilities here.

Don't feed the troll! I don't think I have read such stupidity on this board before. It is designed to cause a hysterical reaction. Ignore.
Anyone considering engaging with Slip and Turn can I suggest you read this thread first?
Del Prado is offline  
Old 16th Dec 2014, 11:27
  #110 (permalink)  
 
Join Date: Feb 2006
Location: Hants
Posts: 2,295
Likes: 0
Received 0 Likes on 0 Posts
S&T
Far be it from me to feed a troll but:

You claim to be able to read accounts
you claim to understand pensions

You fail to understand that HMG's 49% holding does not mean that we receive money from the taxpayer to bolster pensions or for anything else, even for investment in new equipment.

NATS pays for the pension contributions through employee payments and from company gross earnings.

Any investment in future equipment etc is either paid for directly from earnings, or from business loans.

Instead of 'propping up NATS' HMG shouldered NATS with a large loan which meant that HM treasury pocketed over £600M during PPP, but NATS have to service the debt.

As for knowledge of equipment etc... it was the modern, new equipment that caused this failure. The oldest technology, found in TC, was completely unaffected.

I'm sure you won't understand how this could be if we had to reduce flow, but I've fed you enough, someone else might like to explain to you why we needed the restrictions even if TC was able to operate normally... I'm certain you won't know despite your protestations about your wealth of knowledge.
anotherthing is offline  
Old 16th Dec 2014, 12:17
  #111 (permalink)  
 
Join Date: Apr 2014
Location: London
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
Select committee to hold Deakin's feet to the fire:

Committee to question NATS and CAA over failure in air traffic management system - News from Parliament - UK Parliament

It might be on the UK "Parliament Channel", Freeview 131.
http://www.bbc.co.uk/parliament/prog...les/2014/12/19

Last edited by Downwind Lander; 16th Dec 2014 at 15:52.
Downwind Lander is offline  
Old 16th Dec 2014, 13:29
  #112 (permalink)  
 
Join Date: Oct 2007
Location: 30 Miles from the A1
Posts: 488
Likes: 0
Received 10 Likes on 5 Posts
zonoma - thanks for the link - that's possibly the most pragmatic statement I have ever seen from a Trade Union.
2Planks is offline  
Old 16th Dec 2014, 13:41
  #113 (permalink)  
 
Join Date: Aug 2006
Location: Lemonia. Best Greek in the world
Posts: 1,759
Received 6 Likes on 3 Posts
NATS does not currently benefit from Govt support for its pension plan.
Most of the actual pensioners in the pre-privatisation phase were put in the CAA's part of the plan.

The real cost was in the privatisation process, when HMG stuffed the CAA and the NATS pension funds full of taxpayers money. That was because prior to the privatisation, the liabilities had been HMGs.

However, if either the NATS or the CAA's pension plans hit problems, they will rush to HMG for more money. It's one of those "Is the Pope a Catholic" sort of questions.
Ancient Observer is offline  
Old 16th Dec 2014, 14:46
  #114 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by slip and turn
It's called error handling and it is an absolutely critical part of any computer program. If a line of code receives unanticipated data (which may not be 'bad' per se), that unforeseen use case needs to have been foreseen by whomever put together the program spec, whomever agreed the program spec, whomever designed the logic that was intended to handle it flawlessly in the code, whomever checked it, whomever tested it, and whomever signed off on the project or module or upgrade, but one or all six or sixty of whom we all now know was mistaken. And there's the rub

So now we are told that a single line of code stopped the machine, what actually was it in the real time real life world that was unforeseen? That would be the real story.

If I was anotherthing or Gonzo or Zooker or eglnyt et al, I'd have asked that one at the office by now
The NAS Host software that was written in 1969 - 1971 yes in Jovial and BAL (basic assembler language) is actually extremely reliable. However, it was made to run on a set of 6 IBM360's known (in the trade) as the IBM 9020D. UK CAA did not purchase the 9020E which was another team of 6 IBM360's that did radar data processing.

So the 9020D had 3 input output processing IBM360's and 3 compute element IBM360's - all running at an impressive 300,000 integer instructions a second.

Now the architecture is what made the system reliable. The system was a multiprocessor multi-programming system and any program that was pre-empted could be picked up by another processor. The system repeatedly recorded checkpoint recovery data from once a second out to a few minutes. So if an error was found by the computer (what would give a BSOD in a PC) the IBM360 involved would stop all the other processors and give them the checkpoint data and all the processors would rerun precisely the same program and data. If only one of the processors got the error then the error must be hardware in that processor and it put itself offline. If all the processors got the error then the error must be software and the 9020 did a core dump (a large hexadecimal printout), threw away all its input messages, then restarted (startover) from a clean checkpoint say 3 minutes before. As software faults in a real time system are normally timing/preemption related or caused by a broken input message, the system would normally startover successfully. Controllers would receive a message 'STARTOVER at time - please re-input any messages' (or words to that effect). If Gork put in the broken message again then it could cause the startover again. However, the Data systems specialist would be looking at the last messages in and identify Gork's message and somewhat testily suggest that he did not re-enter the message next time.
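
For those who think better in code, the rerun-and-compare logic above can be sketched roughly as follows. This is a toy model, not the 9020 implementation: every name is made up for illustration, a "processor" is just a Python function, and the "transient" branch (nobody reproduces the error) is my own assumption, since the description above does not cover that case.

```python
from typing import Callable, Dict, List

# A "processor" is modelled as a function that reruns the job on the
# checkpointed inputs and raises an exception if it hits the error.
Processor = Callable[[List[str]], None]


def diagnose(processors: Dict[str, Processor], inputs: List[str]) -> dict:
    """Rerun the same inputs on every processor and classify the fault."""
    failed = []
    for name, run in processors.items():
        try:
            run(list(inputs))  # each processor gets an identical copy
        except Exception:
            failed.append(name)

    if not failed:
        # Assumption: an error that does not reproduce is treated as transient.
        return {"fault": "transient", "offline": [], "startover": False}
    if len(failed) == len(processors):
        # Everyone reproduces it: software or a broken input message.
        # The real system would dump core, discard inputs and start over.
        return {"fault": "software", "offline": [], "startover": True}
    # Only some processors fail: blame their hardware, take them offline.
    return {"fault": "hardware", "offline": failed, "startover": False}


def healthy(inputs: List[str]) -> None:
    """Illustrative processor that only fails on a broken message."""
    for message in inputs:
        if message == "BROKEN":
            raise ValueError("cannot parse message")


def faulty_hardware(inputs: List[str]) -> None:
    """Illustrative processor that fails whatever it is given."""
    raise RuntimeError("parity error")
```

With two healthy processors and one faulty one, `diagnose` reports a hardware fault and names the faulty unit; feed every processor Gork's `"BROKEN"` message and it reports a software fault with a startover.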

OK, so now the system is rehosted as a virtual machine inside a nice shiny new machine. A lot of the automated recovery that was built in may not work quite that way (I don't know how that is now implemented), so I rather think that it may take more manual intervention if the Host software has a glitch.
Ian W is offline  
Old 17th Dec 2014, 15:50
  #115 (permalink)  
 
Join Date: Apr 2010
Location: London
Posts: 7,072
Likes: 0
Received 0 Likes on 0 Posts
Heathrow Harry is offline  
Old 17th Dec 2014, 16:13
  #116 (permalink)  
 
Join Date: Apr 2014
Location: London
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
Typescript of Monday's Transport Select Committee meeting with McLoughlin.

http://data.parliament.uk/writtenevi...oral/16712.pdf

Video of interviews with Deakin, Rolfe and Haines.
London air traffic control failure examined - News from Parliament - UK Parliament

Report to be out by the beginning of April.
Downwind Lander is offline  
Old 18th Dec 2014, 10:58
  #117 (permalink)  
 
Join Date: Mar 2008
Location: London
Age: 69
Posts: 148
Likes: 0
Received 0 Likes on 0 Posts
Draft terms of reference for the inquiry have appeared at

Terms of reference | Regulatory Policy | About the CAA

It mentions

including the measures that had been put in place to prepare for routine changes to systems that had occurred on the 11 December 2014 date for the regulated changes to aeronautical information (the AIRAC date) and for the move of additional workstations to support the military task that was re-locating from Prestwick.
and
  • The preparation and testing of premeditated operational/engineering changes to systems and procedures planned to take place on or about regular AIRAC dates or in association with particular infrastructure changes.
which is the first time I have seen links to system changes as being a potential contributory cause.
118.70 is offline  
Old 18th Dec 2014, 11:53
  #118 (permalink)  
 
Join Date: Dec 2006
Location: Florida and wherever my laptop is
Posts: 1,350
Likes: 0
Received 0 Likes on 0 Posts
The Flight Data Processing software uses an 'Adaptation Controlled Environment System' - that describes the airspace and a huge number of other parameters used by the FDPS as it is running. It is feasible that an error in the Adaptation update could lead to a program crash, but these changes are usually very carefully controlled. NATS has had a lot of practice doing them since the seventies.
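
As a rough illustration of why a bad adaptation update could matter, and of the sort of check that "very carefully controlled" implies, here is a minimal validate-before-activate sketch. The data layout (sector name mapped to floor and ceiling flight levels) and all names are hypothetical; I am not describing the real adaptation format.

```python
from typing import Dict, List, Tuple

# Hypothetical adaptation record: sector name -> (floor, ceiling) flight levels.
Adaptation = Dict[str, Tuple[int, ...]]


def validation_errors(adaptation: Adaptation) -> List[str]:
    """Return a list of problems found in a candidate adaptation set."""
    errors = []
    if not adaptation:
        errors.append("no sectors defined")
    for name, limits in adaptation.items():
        if len(limits) != 2:
            errors.append(f"{name}: expected (floor, ceiling)")
            continue
        floor, ceiling = limits
        if floor < 0 or ceiling <= floor:
            errors.append(f"{name}: invalid levels {floor}-{ceiling}")
    return errors


def apply_update(current: Adaptation, candidate: Adaptation) -> Adaptation:
    """Activate the candidate only if it validates; otherwise keep the current set."""
    return current if validation_errors(candidate) else candidate
```

The design choice is simply that the running system never sees an update that failed its checks; anything the checks do not anticipate can still get through, which is the scenario raised above.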
Ian W is offline  
Old 18th Dec 2014, 14:44
  #119 (permalink)  
 
Join Date: May 2002
Location: Manchester uk
Posts: 2
Likes: 0
Received 0 Likes on 0 Posts
LATCC NAS Software

As the Software Engineer responsible for acceptance of the 9020 software by NATS in 1974, I am concerned if it is still in use. Constant modification in dead computer languages and re-hosting over 30 years are not conducive to reliability. Part of the problem was, and maybe still is, NATS management's inability to successfully plan the next generation of systems whilst implementing the previous generations.
stanprice is offline  
Old 18th Dec 2014, 16:16
  #120 (permalink)  
 
Join Date: Aug 2001
Location: se england
Posts: 1,579
Likes: 0
Received 48 Likes on 21 Posts
I think this whole incident shows up something of the overreaction from the media to anything to do with the aviation industry.
Of course it was a serious problem and of course there will be management failings because that's life - management effectively means making do, not being perfect, because perfect is always unrealistically expensive.

Since the incident occurred I have heard every single day of train services in the London area disrupted by signal failures between x and y. Signalling is the ATC of the rail industry, it is an essential safety feature and it is complicated - a lot of it is also quite old.

If you add all the signal failures on the National and urban rail networks this year I would bet that they caused more inconvenience and more delay to many more people than the other week's problem.
Is there a call for an inquiry, are the heads of the relevant service providers summoned to Westminster? No, they were not, but they probably should have been because they caused at least as much chaos, just over a longer time span.
pax britanica is offline  

