I mean, you may not be Google, but... oh, hmm, yeah.
Our reminder of the week that configuration management is hard, that it is prone to errors, and that those errors can have a massive impact comes from none other than the auteurs of automation, the commanders of complexity themselves, Google.
Yep, thanks to a bug in "an internal system that generates configuration", Gmail was down for at least 20 (and up to 50) minutes on Friday.
"The incorrect configuration was sent to live services..., caused users’ requests for their data to be ignored, and those services, in turn, generated errors," said VP of Engineering Ben Treynor.
I don't need to explain what up to 50 minutes of Gmail downtime means. Consider the event a community service announcement, though (a follow-up to Dropbox's the previous week). Configurations matter. You need to monitor your configurations for drift. You should control them through executable policies, for compliance as well as functionality.
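To make "monitor for drift" and "executable policies" a little more concrete, here is a minimal Python sketch. It is an illustration only, not how Google (or any particular tool) does it. The config keys (`tls_enabled`, `request_timeout_s`, `replicas`), the `POLICIES` list, and the function names are all hypothetical, and a real setup would pull the baseline from version control and the live values from the running service.

```python
import hashlib
import json

ABSENT = "<absent>"  # marker for a key present on one side only


def fingerprint(config: dict) -> str:
    """Return a SHA-256 digest of a config that ignores key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(baseline: dict, live: dict) -> dict:
    """Map each top-level key that differs to an (expected, actual) pair."""
    drift = {}
    for key in sorted(set(baseline) | set(live)):
        expected = baseline.get(key, ABSENT)
        actual = live.get(key, ABSENT)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


# Hypothetical policies: a name plus a check that must hold for the config.
POLICIES = [
    ("tls must be enabled", lambda c: c.get("tls_enabled") is True),
    ("request timeout must be 1-60 seconds",
     lambda c: 1 <= c.get("request_timeout_s", 0) <= 60),
]


def violations(config: dict) -> list:
    """Return the names of every policy the config fails."""
    return [name for name, check in POLICIES if not check(config)]


if __name__ == "__main__":
    baseline = {"tls_enabled": True, "request_timeout_s": 30, "replicas": 3}
    live = {"tls_enabled": False, "request_timeout_s": 30, "replicas": 3}

    # A cheap fingerprint comparison tells you *whether* something drifted;
    # the key-by-key diff and the policy check tell you *what* and *why it matters*.
    if fingerprint(baseline) != fingerprint(live):
        print("drift:", find_drift(baseline, live))
    print("policy violations:", violations(live))
```

The fingerprint is the part you can run constantly and cheaply; the diff and the policy check only need to run when it changes. Running the same policy check before a generated configuration is pushed to live services is the other half of the idea.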
You can have the best systems in the world, and the smartest people to run them, but it won't mean a thing if your configuration isn't solid.