U.K. NATS Systems Failure
I'm increasingly baffled by the 'duplicate waypoint name' issue. This suggests - it's surely impossible - that waypoints are known to NATS as strings of characters. Surely each waypoint has a globally unique identifier or key? The flightplan has to mean something to the personnel, so they can select 'INGOR, ANNET, NAKID...' or whatever, but behind the scenes each of those should be a unique id. What am I missing?
I also have no idea why the system was allowed to get into a state where it decided it was utterly untrustworthy and collapsed. Planes have multiple software paths so that a rogue path can be outvoted by the other two. Why didn't the NATS system look at its own performance against all the other flights it was handling, and make at least an interim choice to keep going while it flagged up the anomaly?
Join Date: Oct 2004
Location: Southern England
Posts: 483
Join Date: Mar 2008
Location: London
Age: 69
Posts: 148
Continuing with Simon Calder's speculation as to which flight caused the issue, I did wonder whether confusion between Las Flecheras airport (SVSR/SFD) in San Fernando de Apure, Venezuela, and the Seaford waypoint might have been the problem. The quoted 4000nm distance would fit that. But from the description of the circumstances I would have expected the system to pick up WAFFU in the event of not being given an exit point.
OK, poor choice of words. Didn't we decide that, while we may not be able to conclusively rule out the "duplicate waypoint" scenario, there is no specific evidence pointing to it?
Join Date: Oct 2004
Location: Southern England
Posts: 483
Join Date: Nov 2018
Location: UK
Posts: 82
I'm not sure I understand your clarification.
The error chain in the report only makes sense if there was a duplicate waypoint. It still makes sense if there was no duplication in the actual flightplan, as long as there was a duplicate between a waypoint in the plan and a point just past UK airspace, not necessarily on the aircraft's route. There is, as you say, no "evidence", because the interim report doesn't include the flightplan or identify the waypoint concerned, but if you mean simple duplication wasn't the scenario then I'd agree we can probably discount that. If you are saying that duplication played no part in this then you are saying the report is false. If so, it would be interesting to speculate what they have to gain by fabricating a fictional error chain rather than the actual one.
The ADEXP waypoints plan included two waypoints along its route that were geographically distinct but which have the same designator.
Although there has been work by ICAO and other bodies to eradicate non-unique waypoint names there are duplicates around the world. In order to avoid confusion latest standards state that such identical designators should be geographically widely spaced. In this specific event, both of the waypoints were located outside of the UK, one towards the beginning of the route and one towards the end; approximately 4000 nautical miles apart.
As a result, at the point of publication of this report many lines of enquiry remain ongoing. These include, but are not limited to:
...
9) The feasibility of working through the UK state with ICAO to remove the small number of duplicate waypoint names in the ICAO administered global dataset that relate to this incident.
...
Join Date: Oct 2004
Location: Southern England
Posts: 483
Given they've been trying to do so for many years and still had over 3000 the last time I saw any reports then yes.
States are all in favour of removing them as long as it isn't theirs which is removed, and of course one very large state with a lot of waypoints may not be engaging at the present time.
Until this incident the major risk with duplicates was considered to be passing a crew a clearance or instruction and them setting off for the wrong one. Most states have controlled that by instructing controllers not to issue clearances to duplicate waypoints and by making sure significant points, such as exit and entry points and those on procedures, aren't in the duplicate list. All new points are supposed to be requested through ICAO's tool, which doesn't allow duplicates.
This incident probably doesn't introduce a greater risk. Far easier to test & fix your software now you know of the circumstances.
Join Date: Jul 2014
Location: UK
Posts: 116
Join Date: Oct 2004
Location: Southern England
Posts: 483
If a global dataset exists, it would hardly be difficult to assign a unique key to each listed waypoint. That way, you could have ten BUNKA waypoints in the US - heck, ten BUNKA waypoints in Indiana alone - and the system would have no problem working out you weren't planning a sudden diversion to Africa.
I'm genuinely stunned that the system works by parsing and comparing character strings. It's as if it was written in FORTRAN in 1977.
We then put that data into much more sophisticated systems and the first thing they need to do is parse that character string and convert it into something more like you are talking about. It was during that operation that things went wrong.
Things in aviation move very slowly, usually at the speed of the slowest, least capable state. There is work to move this along but it'll be a while yet.
And "global dataset" would be an exaggeration. There are organisations with a dataset but no true single point of truth. Not everyone uses the ICAO tool, and it never had a complete set to start with, so even that isn't complete.
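To make the "parse the string and convert it" step concrete, here is a minimal Python sketch of one way a designator could be turned into a unique record: among all waypoints sharing the name, take the one nearest the previously resolved point. This is purely an illustration, not how the NATS system does it. The Waypoint record, its key field, the resolve function, and every name and coordinate in the example are invented.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    key: int         # hypothetical unique key, not from any real dataset
    designator: str  # the name as it appears in the filed plan
    lat: float
    lon: float


def distance_nm(a, b):
    """Great-circle distance in nautical miles (haversine formula)."""
    r_nm = 3440.065  # mean Earth radius expressed in nautical miles
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_nm * math.asin(math.sqrt(h))


def resolve(designators, database, origin):
    """Map each designator string to one record, preferring the candidate
    closest to the previously resolved point (an assumed heuristic)."""
    resolved = []
    previous = origin
    for name in designators:
        candidates = [w for w in database if w.designator == name]
        if not candidates:
            raise LookupError(f"unknown designator: {name}")
        best = min(candidates, key=lambda w: distance_nm(previous, w))
        resolved.append(best)
        previous = best
    return resolved


# Invented data: two waypoints share the designator DUPEE.
database = [
    Waypoint(1, "ALPHA", 50.0, -5.0),
    Waypoint(2, "DUPEE", 51.0, -4.0),
    Waypoint(3, "DUPEE", 10.0, 60.0),
    Waypoint(4, "BRAVO", 52.0, -3.0),
]
origin = Waypoint(0, "START", 49.0, -6.0)
route = resolve(["ALPHA", "DUPEE", "BRAVO"], database, origin)
print([w.key for w in route])  # [1, 2, 4]
```

Once the route is a list of keyed records rather than strings, later stages never need to compare names again, which is the point being made about unique identifiers.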
Pegase Driver
Join Date: May 1997
Location: Europe
Age: 74
Posts: 3,694
As pointed out, the ICAO flight plan is a world standard designed in the 1940s. It was coded in such a form that it could be transmitted quickly and, most importantly, understood by every country on the globe.
How the flight plan is processed to fit the ATC system is a local decision. It can be done just to print strips, to activate a basic flight plan processing system using off-the-shelf PCs, or using very sophisticated systems like the one currently used by NATS. And there is no standard for doing this.
Changing the flight plan format to the 21st century and defining a digital standard that every country (including the US) would accept and then retrofit to has been discussed and worked on for decades, and will likely take many more decades to be implemented.
That said, a bug can always bring down a complex system. The French 4Flight or Maastricht MADAP are as sophisticated as the NATS system, if not more so; they also have such failures from time to time, but their various backup systems activate to enable a quasi-transparent restart for the users. Why the backup did not work as (hopefully) designed is the real issue, and why it took them so long to isolate the problem is the other.
Focussing on a wrong flight plan, or a duplicate five-letter waypoint, is not really the issue here, in my view at least, as you can never eliminate 100% of them.
Dual named waypoints
Perhaps I'm barking up the wrong tree and oversimplifying what I understood the problem to be... Was it filing a flight plan route via distant-from-each-other waypoints with the same name or identifier codes? But surely that couldn't have been what happened?
For example: what if you flew from the Middle East to the UK and your route took you over Muscat (Oman), VOR code MCT, to overfly Manchester (UK), VOR code MCT? Would the same problem recur? Victor Meldrew would say, "I don't belieeeeve it."
Last edited by Jetset 88; 11th Sep 2023 at 21:10. Reason: typo
Join Date: Oct 2004
Location: Southern England
Posts: 483
Probably not. Without knowing which waypoint was involved we can't be sure, but the report suggests that the issue only occurs if there is a waypoint earlier in the plan which is a duplicate of another point just beyond the UK FIR and, importantly, the exit point from the UK FIR is not explicitly included in the plan.
Join Date: Oct 2004
Location: Southern England
Posts: 483
It would be quite a difficult shortcoming to exploit. Merely filing a plan wouldn't be sufficient even if you could convince the "system" that you intend to fly that route.
Probably worth reading this:
https://jameshaydon.github.io/nats-fail/
The algorithm used to find the UK portion is, not to put too fine a point on it, dumb. It works by searching forwards through the flight plan until it finds the entry point, then skips to the end and searches backwards, which will have all kinds of exciting consequences if:
- the route leaves UK airspace and re-enters it (the leg or legs outside the airspace will be wrongly included as part of the UK portion)
- the route exits UK airspace through the same waypoint by which it entered (the leg within UK airspace will disappear)
- the exit point is not explicitly stated and a duplicate is present (what happened this time)
- probably more cases I haven't thought of
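As a rough illustration of the first case in that list, here is a small Python sketch of the forward-then-backward search as described in this post. It is only one reading of that description, not the actual NATS code, and the function name, waypoint names and the set of "UK" points are all invented. How the duplicate-name case plays out depends on details not given here, so it is not reproduced.

```python
def uk_portion_naive(plan, uk_points):
    """Forward search for the first UK waypoint, then jump to the end of the
    plan and search backwards for the last one. Everything in between is
    treated as the UK portion, whether or not it is actually in the UK."""
    # Forward search: index of the first waypoint that is in the UK.
    start = next((i for i, wp in enumerate(plan) if wp in uk_points), None)
    if start is None:
        return []  # the plan never touches the UK
    # Skip to the end and search backwards for the last UK waypoint.
    end = next(i for i in range(len(plan) - 1, -1, -1) if plan[i] in uk_points)
    return plan[start:end + 1]


# Invented route that leaves UK airspace and re-enters it.
plan = ["FRA1", "UKA", "UKB", "FRA2", "UKC", "UKD", "IRL1"]
uk_points = {"UKA", "UKB", "UKC", "UKD"}
portion = uk_portion_naive(plan, uk_points)
print(portion)  # ['UKA', 'UKB', 'FRA2', 'UKC', 'UKD']
```

The foreign waypoint FRA2 ends up inside the "UK portion" because nothing between the two search results is ever checked.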
Compare the following program: from the beginning of the plan, check each waypoint to see if it's in the UK. When the first UK waypoint is found, create a UK leg starting at that waypoint. Add subsequent waypoints to the leg if they are in the UK. When a foreign waypoint is found, end the UK leg after the preceding waypoint and add it to the list "UK legs" under the flight plan ID. Continue until you have no more waypoints, and move to the next flight plan.
You'll observe that this copes fine with loops and missing entry/exit points, although we still need to check for duplicates explicitly; a simple dupe catcher would be to flag any UK leg that contains exactly one waypoint for review, because either it's a route that passes in and out over the same point without going anywhere else in the UK (weird but I suppose just possible) or it's a duplicate.
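The single-pass program and the simple dupe catcher described above could be sketched in Python roughly as follows. The function names and the example route are invented, and the in_uk test is passed in as a parameter because deciding whether a waypoint is in the UK is a separate problem, presumably one to be answered from resolved positions rather than names.

```python
def uk_legs(plan, in_uk):
    """Walk the plan once from the beginning, collecting each run of
    consecutive UK waypoints as a separate leg."""
    legs = []
    current = []
    for wp in plan:
        if in_uk(wp):
            current.append(wp)    # start or extend the current UK leg
        elif current:
            legs.append(current)  # a foreign waypoint ends the leg
            current = []
    if current:
        legs.append(current)      # the plan ended while still inside the UK
    return legs


def legs_for_review(legs):
    """Flag any UK leg made of exactly one waypoint: either an in-and-out
    over a single point, or a possible duplicate."""
    return [leg for leg in legs if len(leg) == 1]


# Invented route: two separate UK legs, then a lone "UK" point far down route.
uk_points = {"UKA", "UKB", "UKC", "UKD", "DUPEE"}
plan = ["FRA1", "UKA", "UKB", "FRA2", "UKC", "UKD", "IRL1", "DUPEE", "USA1"]
legs = uk_legs(plan, lambda wp: wp in uk_points)
print(legs)                   # [['UKA', 'UKB'], ['UKC', 'UKD'], ['DUPEE']]
print(legs_for_review(legs))  # [['DUPEE']]
```

The route that leaves and re-enters produces two clean legs, and the lone DUPEE is flagged for a human to look at instead of bringing anything down.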
Even if those edge cases are rare and weird, they're not impossible, and of course someone might file a malformed plan maliciously now they know how to break the system.