I’m not really sure what point you’re trying to make, since I very clearly wrote that I also don’t think this was Google’s fault (even if they did stop sending people through that area a mere couple of weeks after this incident).
I also don’t think it’s fair to blame these people for this, and so I’m trying to understand what you would’ve done differently in the same situation.
Your proposed solution to overly complex systems seems to be to ignore the requirements that make them complex in the first place. If that works for you, it’s a perfectly fine approach. But most companies with actual signed SLAs won’t accept “we’ll just have a few seconds of downtime/high latency every time a developer deploys something to production #yolo”.