Since the Delta incident was mentioned, along with all the talk about redundant power, James Hamilton put up a nice article the other day on what happened (and, for the real pros, how to avoid it):
At Scale, Rare Events aren't Rare - Perspectives
(and even ultra-redundant Amazon AWS managed to trash its S3 service for over 4 hours last month)
The real disaster was the time to recover PLUS the horrific communications.