Schrodinger's DevOps - Why You Need Visibility Before Automation
Updated on May 31, 2018
by Greg Pollock
So a cat walks into a bar. No, that’s not right. He walks into a box. The cat gets bombarded with radiation. It used to be a bar but a lot of people died from the radiation so they turned it into a box. Is the cat dead or alive?
This is, of course, the story of Schrodinger’s cat and the answer to the question is “yes.” It is alive and dead and possibly a supercat until it is observed. That’s one of the paradigmatic shifts of quantum physics: the state of a system is affected by the observation of that system.
The lesson of Schrodinger’s cat, that observation collapses uncertainty, is essential for the future of DevOps. Automation lets us do things really fast, but the fact that something has been automated doesn’t tell us whether it is correct. (Yes, we are claiming that there is a limit to the value of automation; we know that sounds crazy.) The only way to be certain that our automation is adding to our business value is to observe the system in which it takes place. Then everything falls into place: the uncertainty collapses, you understand the state of your system, and either your cat is fine or, hey, you’re getting a new cat!
The enterprise is caught in the cat zombie stage of automation. On one hand, the enterprise can let its best and brightest exercise their talent to use emerging technology as fast as possible. This will give them a collection of cutting-edge tools that depends on increasingly heterogeneous environments, libraries, cookbooks, etc., each of which is really only understood by the people (or person) who wrote it. On the other hand, the enterprise can maintain consistency at the cost of speed. They will be stuck running an extreme three-legged race: thousands of engineers bound in lockstep as more agile organizations eat their lunch one release cycle at a time.
The dilemma of the enterprise is the dilemma of every software developer writ large. Either everyone understands what everyone else has done—both in engineering and operations—or something breaks and you have to untangle it in the way that is most costly to the business and most stressful for your team.
The problem here is that neither moving faster nor slower towards automation will help. We have to resolve the ambivalence in the system with a third option: observation. Making it easy for everyone involved in your software development lifecycle to observe your system configurations enables them to move fast while heading in the same direction.
The cat is either going to be alive or dead. The sooner you figure it out, the better—for you and the cat. While your site might be working fine today, in the absence of observation you can’t know if you have zombie configuration differences hanging around to bite you in the future. Spare your system the torment of undeath. While it might be easier not to think about lurking config drift, it will be better for everyone if you look inside the box.
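As a rough illustration of what "looking inside the box" can mean in practice, here is a minimal sketch that compares the configuration you expect against the configuration actually observed on a machine and reports every difference. The function name `find_drift`, the `MISSING` marker, and the sample keys and values are all hypothetical, chosen only for this example; a real setup would collect the observed values from the systems themselves.

```python
from typing import Any, Dict, Tuple

# Hypothetical marker for a key that exists on only one side of the comparison.
MISSING = "<missing>"


def find_drift(
    expected: Dict[str, Any], actual: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (expected_value, actual_value)} for every key that differs."""
    drift: Dict[str, Tuple[Any, Any]] = {}
    # Walk the union of keys so that missing and unexpected settings both show up.
    for key in sorted(set(expected) | set(actual)):
        want = expected.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    # Illustrative values only; not taken from any real system.
    expected = {"nginx_version": "1.14.0", "tls_enabled": True, "max_connections": 1024}
    actual = {"nginx_version": "1.12.2", "tls_enabled": True, "debug_port": 9000}

    for key, (want, have) in find_drift(expected, actual).items():
        print(f"{key}: expected {want!r}, found {have!r}")
```

An empty result means the observed state matches what you intended; anything else is a zombie difference you now know about.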
Misconfigurations are an internal problem that emanates from within the IT infrastructure of any enterprise; no hacker is necessary for massive damage to occur to digital systems and stored data. And the problem is pervasive, with Gartner estimating that anywhere from 70% to 99% of data breaches result not from external, concerted attacks, but from internal misconfiguration of the affected IT systems.