PPRuNe Forums - View Single Post - MAX’s Return Delayed by FAA Reevaluation of 737 Safety Procedures
3rd Aug 2019, 04:56
#1717
Water pilot
 
Join Date: Mar 2015
Location: Washington state
Posts: 209
Originally Posted by mrdeux
Some years ago, I discovered a way to reliably, and repeatedly, make a 767, with autopilot engaged in VNAV, fly through the MCP altitude. I reported this to our tech people, who passed it along to Boeing. Very quickly I heard that they’d been able to replicate it in a system sim, and a red bulletin was soon issued. It was fixed in an update a few months later.

Fast forward ten years, and I was now flying the 747. An update came out, and lo and behold, the MCP bug had reappeared. Apparently the software had simply been modified to bypass the offending code, and a later update had removed the bypass.

The point is that the software fix itself was not permanent.
That is a very important point, and something that I have unfortunately noticed in navigation systems recently. A lot of software these days is written by temporary contractors, so there is no institutional memory about why the code is the way it is, and in my experience many programmers (especially newer ones) do not bother to look through the change history of the code to understand it. In addition, many programmers (both new and old) don't bother to document their decisions properly, despite all of the automated nag systems, management reviews, etc. (You tend to get a lot of boilerplate.)

On one machine that I worked on, there was a nexus of software bugs that had been revolving around each other for about a decade. There were three symptoms, all moderately difficult to fix. In condition 1, the machine would (very rarely) lock up. In condition 2, a critical table in the system was (very rarely) corrupted. In condition 3, the transaction log (used for distributed processing) was corrupted. I don't remember the exact relationship, but it was something like: if you fixed 1 and 2 you got 3, if you fixed 3 you got 2, and if you fixed 2 and 3 you got 1. I think I encountered condition 2, but on checking the change log (if any programmers are here, take note) I noticed that my fix, which I was proud of, was actually identical to code that had been put in place years ago.

It ended up being a real bear to more or less fix, but this was in the dark ages, the two original programmers of the system were still with the company, and we were all able to put our heads together and understand what was going on. It was one of those cases where the actual fix probably would have been to completely redesign the system, but that was not an option (sound familiar?). Nowadays, I think the newly minted contract programmer would simply fix the condition that they were presented with, the next newly minted contract programmer would fix the next condition, and (as you experienced) the end result would be bugs that pop in and out of the system at each release cycle. It would not surprise me all that much if all of the vaunted changes to MCAS get rolled back sometime in the future by newly minted programmers who never heard of Lion Air.