Updated on July 26, 2017 by Alan Sharp-Paul
Now do me a favor. Repeat the above five times aloud. Breathe in deeply the scent of your own delusion, splash ice cold water on your face and, when you're ready, rejoin me back in reality. For the tale of automation is one of broken promises and unsubstantiated hype, of Enterprise-averse open source projects backfilled with support subscriptions and hidden armies of professional service consultants.
Automation is a noble goal but the road to automation in the Enterprise is littered with corpses. After all, very few successful Enterprise initiatives start with the words:
We need to train everyone in Ruby
So what am I suggesting? Forget automation altogether? Of course not. It is a noble goal for a reason. The main risk I am cautioning against is rushing full steam into automation without first laying the groundwork for your success.
451 Research is seeing similar trends. Here's what Senior Research Analyst Jay Lyman had to say on the topic:
I think there is a tendency to think that large enterprise organizations, with all of their divisions and teams and silos, are capable of doing what Facebook or Netflix have done with their cutting edge implementations of configuration management tools. In reality, all of the legacy technology and process has to be taken into account as well. So no, everyone does not have to learn Ruby, but they do all have to work better with one another, including across teams and within the boundaries of the technologies they use.
- Jay Lyman, Senior Research Analyst @ 451 Research
Pure greenfield implementations and A+ team organizations aside, what we've seen in our experience working with companies in this space is that there are three waves to traverse to guarantee success with an automation initiative. Most efforts fall down because they skip the first two entirely.
Visibility -> Accountability -> Automation
In the remainder of this post I'll explain what each wave means, and why each one is critical to sustainable automation within the Enterprise.
A common misconception for Enterprises commencing their automation journey is that the key preparation work is choosing a tool and training their staff up. These are necessary evils, sure, but the real work is actually gathering requirements. With legacy infrastructure in play what matters most is getting visibility of current state.
What people realize very quickly is that this is no mean feat. Yes, you can limit the initial pain by keeping your focus tight, but it will not change the fact that every automation script you build will require a combination of stakeholder interviews, documentation trawling and manual investigation in order to get it right. It is crucial work, but it can be a nightmare. It's tempting to jump straight to writing automation scripts but to do so without a full appreciation of your current state is suicide.
Don't automate what you don't understand
How you get this visibility is up to you. If you are relying solely on documentation and the recollections of your coworkers I wish you all the luck in the world. You are going to need it.
With your infrastructure and configuration understood you are ready to progress to the second wave, accountability. Why can't we start automating once we understand the state, you ask? Well, you may now understand your state but if you can't validate it how will you gauge the success of your automation efforts? More importantly, how will you maintain their quality over time?
The fact is that overarching visibility of your state is a great start, but you are only truly ready for automation if you can also validate the key components of your configuration at any point in time. Are the configuration files in place? Do they contain the right settings? Is this port open? Is that port closed? Are we patched sufficiently? Are we using the correct version of this package?
To gain accountability for your apps and infrastructure configuration you need to first define this desired state. To enforce it, though, that desired state should also be executable, because validating it manually is near impossible.
Don't automate what you cannot validate
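To make the idea of an "executable desired state" concrete, here is a minimal sketch of one such check in Python. The function name, the INI-style config format, and the `[server] port` setting are all illustrative assumptions, not a reference to any specific tool; the point is simply that "is the right setting in place?" can be asked by a script rather than a person.

```python
import configparser
from pathlib import Path

def file_contains_setting(path, section, key, expected):
    """Return True only if the config file exists, parses as INI,
    and holds the expected value for the given section/key.
    (Illustrative check; real desired-state suites cover many more.)"""
    p = Path(path)
    if not p.is_file():
        return False
    cfg = configparser.ConfigParser()
    cfg.read(p)
    return cfg.get(section, key, fallback=None) == expected
```

A check like this can run on a schedule or in a pipeline, so drift from the desired state surfaces immediately rather than during the next outage.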
But my automation tool will do this for me, you bleat. Really? If you think that the tool building your apps and infrastructure should also be the one validating it then I would beg to differ. What is validating that the automation tool is correctly configured? How is it managing non-functional requirements, especially negative checks (such as the aforementioned closed port check)?
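Negative checks in particular are worth validating outside the automation tool itself. As a sketch (assuming plain TCP and hypothetical function names), a closed-port assertion needs nothing more than the standard library:

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Attempt a TCP connection; True if something is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def assert_port_closed(host, port):
    """Fail loudly if a port that should be closed is accepting connections."""
    if port_is_open(host, port):
        raise AssertionError(f"{host}:{port} should be closed but is open")
```

Because this validator has no dependency on Puppet, Chef, or Ansible, it can act as an independent witness to whatever those tools claim to have done.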
Now, with your infrastructure and application configuration understood and your desired state verifiable, you are at last ready for automation. The best part is that the solid foundation you laid down in the first two steps gives you the ultimate freedom in tool choice. Choose Puppet. Go with Chef. Try Ansible. If you're happy with bash for now then that's OK too. There's no need to rush into anything.
The known and testable environment you have in place means that you can rest assured that your configuration is sound no matter how your builds are being done. The effects of a manual build will be just as visible, and just as verifiable, as those brought to life using the most elegant recipe, manifest or playbook.
Your automation efforts can be targeted and deliberate, and as such, they can remain under control. This is important, as many an Enterprise has learned the hard way that the introduction of complicated automation scripts to manage complicated server builds simply ends up substituting one management headache for another.
Just remember, automation tools are not silver bullets. In the right hands they can work wonders. In the Enterprise, success in using them is far from guaranteed. Quite the opposite in fact.
Social proof of this is the number of people willing to jump ship whenever a new tool pops up.
Now don't get me wrong, Ansible has made great strides in simplifying automation. When people who've successfully traversed the dreaded initial learning curve with a tool like Chef are calling time on using it so flippantly though there has to be something fundamentally wrong. We don't believe it's the tools themselves. It's just the approach people are taking with them.
So, a new DevOps mantra for all you Enterprise folk who are looking at IT automation: Visibility -> Accountability -> Automation. Your simple plan for automation success.