As the number of online services offered by enterprises continues to grow at a rapid pace, IT teams are playing an ever more important role, with less room for network downtime or failure than ever before.
Despite the need for IT stability, very few companies or organizations have avoided outages recently, and some of those outages have had serious consequences. According to a recent survey conducted by the Ponemon Institute and sponsored by Emerson Network Power, 78% of outages in 2015 were attributed to hardware failure, system upgrades and human error, while just 22% occurred as a result of cyber-attacks.
To give IT teams greater scalability and efficiency, companies have started shifting towards software-defined datacenters, which rely on virtualization and automation technologies. Configuring IT assets in much the same way we write code lets companies execute tasks simply and on a large scale. But the risk of a critical system outage caused by a misconfiguration does not go away with the software-defined datacenter.
As a matter of fact, if certain configurations deviate from best practices, automation only spreads the resulting issues faster and makes them more difficult to pinpoint. Therefore, it is imperative that companies approach the shift to a software-defined datacenter carefully, investing time and resources before and during the transition. Below are three suggested steps companies should take to navigate that transition successfully.
Although the idea of automated datacenter provisioning is highly appealing, the first question we need to ask ourselves is what we are automating. Just like modeling a self-driving car after a less-than-perfect driver, taking your existing configuration and making it self-running could be a risky proposition.
Even when your systems are up and running, no datacenter is free of misconfigurations. With every new company we work with, we start with a quick health check of their environment. Each and every one of these health checks has revealed hidden configuration risks that were unknown to the IT team.
Transitioning to the software-defined environment is a great opportunity to clean up any risks that may be lingering in your datacenter today. Before you etch your existing configuration into code, run a thorough health check of all related components.
While doing so, it’s important to remember to check for any cross-domain interdependencies: a misconfiguration in the storage layer may not manifest itself until you analyze how it is connected to your database, for example.
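To make the idea of a cross-domain check concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the inventory layout, the names (`volumes`, `databases`, `find_cross_domain_risks`) and the rule itself (a production database should sit on a replicated volume). It is not a description of any particular tool. The point is that neither layer looks wrong on its own; the risk only appears when the two are joined.

```python
# Hypothetical inventories; in practice these would be collected from the
# storage and database layers. All field names here are assumptions.
volumes = {
    "vol-a": {"replicated": True},
    "vol-b": {"replicated": False},
}
databases = {
    "orders": {"volume": "vol-b", "tier": "production"},
    "reports": {"volume": "vol-a", "tier": "production"},
    "scratch": {"volume": "vol-b", "tier": "test"},
}


def find_cross_domain_risks(volumes, databases):
    """Return one finding per database whose volume breaks a cross-layer rule."""
    findings = []
    for db_name, db in databases.items():
        volume = volumes.get(db["volume"])
        if volume is None:
            # The database points at storage that the storage layer does not know.
            findings.append(f"{db_name}: volume {db['volume']!r} is not defined")
        elif db["tier"] == "production" and not volume["replicated"]:
            # Valid in each layer separately, risky once the layers are combined.
            findings.append(
                f"{db_name}: production database on unreplicated volume {db['volume']!r}"
            )
    return findings


findings = find_cross_domain_risks(volumes, databases)
for finding in findings:
    print(finding)
```

In this sample data, an unreplicated volume is acceptable for the test database but not for the production one, which is why the check has to look at both layers together.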
Once we have a clean environment to model, the next challenge is to ensure the code itself doesn’t introduce any additional risks. Writing these scripts is a new discipline, which means that IT teams are still somewhere on the learning curve, working with tools that are far from mature at this point.
The prudent way to implement new scripts is to test the environment thoroughly (preferably in a sandbox first) with every new version of the code. That said, manually testing each and every configuration and all the related dependencies with each code release is extremely time-consuming and practically unrealistic. The only way to make it work is to meet speed with speed: automated scripts require automated testing.
Automated testing allows teams to achieve better test coverage while reducing testing time, both during the development of scripts and once they are deployed into the production environment.
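As a rough illustration of what automated testing of a configuration could look like, the Python sketch below runs a small set of rules against every host in a proposed configuration and collects the failures. The rules, the host fields (`storage_paths`, `ntp_servers`) and the function names are hypothetical examples chosen for this sketch, not established best practices or the interface of any real product.

```python
def has_redundant_storage_paths(host):
    """Assumed rule: a host should reach its storage over at least two paths."""
    return len(host.get("storage_paths", [])) >= 2


def has_time_source(host):
    """Assumed rule: a host should have at least one time server configured."""
    return bool(host.get("ntp_servers"))


# Rule name -> check function. New rules can be added without touching the runner.
RULES = {
    "redundant storage paths": has_redundant_storage_paths,
    "time source configured": has_time_source,
}


def run_checks(hosts, rules=RULES):
    """Apply every rule to every host and return (host name, rule name) failures."""
    failures = []
    for host in hosts:
        for rule_name, rule in rules.items():
            if not rule(host):
                failures.append((host["name"], rule_name))
    return failures


# Hypothetical output of a provisioning script, checked before it is applied.
proposed_hosts = [
    {"name": "db01", "storage_paths": ["fc0", "fc1"], "ntp_servers": ["ntp.example.internal"]},
    {"name": "db02", "storage_paths": ["fc0"], "ntp_servers": []},
]

failures = run_checks(proposed_hosts)
for host_name, rule_name in failures:
    print(f"{host_name}: failed '{rule_name}'")
```

Because the same checks can run in a sandbox against each new version of a script and again against the production environment, a runner along these lines would cover both stages mentioned above.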
The data shows that configuration changes, or configuration drift, are the number one reason for unplanned system outages. This is probably the biggest issue facing IT teams today, and it doesn’t go away with the software-defined datacenter.
It is highly unlikely that your software-defined datacenter will be one hundred percent automated in the near future. Configuration changes – either manual or through new scripts – are going to continue to creep into your environment on a daily basis, with automation probably contributing to further acceleration in the pace of change.
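One simple way to picture drift detection is a comparison between the configuration you intended and the configuration actually observed. The Python sketch below assumes both can be represented as flat key-value settings. The setting names and values are invented for the example, and `detect_drift` is a hypothetical helper rather than part of any existing tool.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def detect_drift(desired, actual):
    """Return every setting whose observed value differs from the intended one."""
    drift = {}
    # Look at the union of keys so that both removed and unexpected settings show up.
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = {"desired": want, "actual": have}
    return drift


# Hypothetical settings: what the scripts define versus what a scan found.
desired = {"mtu": 9000, "multipath": "enabled", "firmware": "2.1"}
actual = {"mtu": 1500, "multipath": "enabled", "debug_port": "open"}

drift = detect_drift(desired, actual)
for key, values in drift.items():
    print(f"{key}: desired={values['desired']} actual={values['actual']}")
```

Run on a schedule, a comparison like this would flag changes regardless of whether they were made by hand or by a new script.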