It's all to do with Doppler shift, wave compression, air density and Mach number reduced to speed in metres per second, and it's very complicated. Unfortunately, human hearing really only operates on one pulse repetition frequency (PRF), which causes ambiguities.
I would say, therefore, that a reasonable offhand explanation for someone who is not an aeronautical engineer would be that it's all due to variations in air compressibility at altitude, brought about by the extremely cold air. Bearing in mind that the ISA temperature is 15°C at sea level, calculate the ISA temperature at, say, 20,000 ft.
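For anyone who wants to try the sum, here is a rough sketch in Python. It assumes the standard ISA tropospheric lapse rate of 6.5°C per 1,000 m (about 2°C per 1,000 ft) and the usual ideal-gas formula for the speed of sound in dry air. The function names are my own invention for illustration, and the numbers only hold below the tropopause (roughly 36,000 ft).

```python
import math

FT_TO_M = 0.3048            # metres per foot
ISA_SEA_LEVEL_C = 15.0      # ISA sea-level temperature, deg C
LAPSE_RATE_C_PER_M = 0.0065 # ISA tropospheric lapse rate, deg C per metre
TROPOPAUSE_M = 11000.0      # the linear lapse rate only applies below this
GAMMA = 1.4                 # ratio of specific heats for dry air
R_AIR = 287.053             # specific gas constant for dry air, J/(kg K)


def isa_temperature_c(altitude_ft):
    """ISA temperature in deg C at a given altitude (troposphere only)."""
    altitude_m = altitude_ft * FT_TO_M
    if altitude_m < 0 or altitude_m > TROPOPAUSE_M:
        raise ValueError("this sketch only covers 0 ft up to the tropopause")
    return ISA_SEA_LEVEL_C - LAPSE_RATE_C_PER_M * altitude_m


def speed_of_sound_ms(altitude_ft):
    """Speed of sound in m/s at ISA temperature for the given altitude."""
    temperature_k = isa_temperature_c(altitude_ft) + 273.15
    return math.sqrt(GAMMA * R_AIR * temperature_k)


def mach_to_ms(mach, altitude_ft):
    """Convert a Mach number to true speed in m/s at ISA conditions."""
    return mach * speed_of_sound_ms(altitude_ft)


if __name__ == "__main__":
    for alt in (0, 20000):
        print(alt, "ft:",
              round(isa_temperature_c(alt), 1), "deg C,",
              round(speed_of_sound_ms(alt), 1), "m/s at Mach 1")
```

On these assumptions the ISA temperature at 20,000 ft comes out at about -24.6°C, and Mach 1 drops from roughly 340 m/s at sea level to roughly 316 m/s, which is the cold-air effect in a nutshell.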
Toodle pip?