As a provider of effective enterprise solutions, Certsys was well aware of how difficult it is to assure the resilience of IT environments, including its own. Part of their environment is hosted on AWS, so they were already familiar with the AWS Well-Architected Framework, which is made up of five pillars: Operational Excellence, Security, Reliability, Performance Efficiency and Cost Optimization. They believe that a major way to achieve and maintain the resilience of their AWS environment is to adhere to these five pillars and thus create a best-of-breed infrastructure, ensure that the AWS infrastructure delivers maximum benefit, and prevent common technical pitfalls. When they moved their environment to AWS, they followed the Framework’s best practices and guidelines in architecting their infrastructure.
Nonetheless, even when cloud environments are correctly architected and follow best practices, IT environments are dynamic and this leads to a decline in their resilience, reliability and security over time. This is precisely what Certsys experienced and the reason they were in search of a way to avoid performance disruptions, service unavailability, outages and data-loss incidents.
It’s true that companies whose IT environments are hosted on AWS see fewer of these disruptions, and the AWS Well-Architected Framework is one of the reasons why. Still, misconfigurations and single points of failure in IT environments are the main causes of disruptions and outages.
Why resilience decreases over time
Many factors contribute to the complexity of cloud environments and their propensity for misconfigurations: the high velocity of change in an AWS environment, knowledge gaps between the people and teams that maintain the environment and make changes, insufficient controls, and lack of visibility. Together these provide fertile ground for configuration errors and risks.
Certsys needed the assurance of knowing their environment was always in good standing with respect to the AWS Well-Architected Framework, the standard for resilience and reliability the company set for itself. They were certain that this would be the key to meeting their goals for disruption-free 24x7x365 availability and security. They turned to Continuity Software for its Coral™ for AWS solution.
Certsys tests Coral™ for AWS
Certsys recognized Coral’s potential and put the solution to the test.
Certsys knew that its AWS environment’s adherence to the AWS Well-Architected Framework had to be handled automatically and proactively and that they could achieve that by using Coral for AWS.
In early 2020, Certsys conducted a 14-day trial of Coral™ for AWS on a representative subset of the production environment of 500 nodes deployed in two regions: US East (Ohio) and South America (São Paulo). The nodes covered critical applications such as their CRM system. The main AWS services used by the company include EC2 instances, RDS instances, Auto Scaling groups (ASG), IAM, and VPC.
How Coral for AWS achieves and maintains IT resilience
Coral is a SaaS solution deployed on AWS that automatically and proactively detects misconfigurations and risks across all components of AWS environments including virtual machines, containers, networks, load balancers, databases, cloud storage, DNS, and more.
To identify these risks, the solution draws on its proprietary knowledge base, which contains hundreds of rules covering the best practices needed to stay aligned with each of the five pillars of the AWS Well-Architected Framework. This process allowed Certsys to gain visibility into their AWS environment and its configuration, pinpoint problem areas, and repair them before they could lead to a security breach, costly disruptions or outages, and business impact.
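To make the idea of a rule-based scan more concrete, here is a minimal Python sketch. It is not Coral's implementation, and it does not reflect its proprietary knowledge base. The rule IDs, the resource fields and the sample inventory are all hypothetical, chosen only to illustrate how best-practice rules, each tagged with a Well-Architected pillar, might be evaluated against a description of an environment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """One best-practice check. All fields are illustrative, not Coral's schema."""
    rule_id: str
    pillar: str                          # Well-Architected pillar the rule maps to
    description: str
    applies_to: str                      # hypothetical resource type label
    is_violated: Callable[[dict], bool]  # returns True when the resource is at risk


# Hypothetical rules; a real knowledge base would hold hundreds of these.
RULES = [
    Rule("REL-001", "Reliability",
         "RDS instance is not Multi-AZ (single point of failure)",
         "rds",
         lambda r: not r.get("multi_az", False)),
    Rule("REL-002", "Reliability",
         "Auto Scaling group spans fewer than two Availability Zones",
         "asg",
         lambda r: len(r.get("availability_zones", [])) < 2),
    Rule("SEC-001", "Security",
         "Security group allows SSH from 0.0.0.0/0",
         "security_group",
         lambda r: any(p["port"] == 22 and p["cidr"] == "0.0.0.0/0"
                       for p in r.get("ingress", []))),
]


def scan(resources, rules=RULES):
    """Evaluate every applicable rule against every resource and collect findings."""
    findings = []
    for res in resources:
        for rule in rules:
            if rule.applies_to == res["type"] and rule.is_violated(res):
                findings.append({
                    "resource": res["id"],
                    "region": res["region"],
                    "rule": rule.rule_id,
                    "pillar": rule.pillar,
                    "description": rule.description,
                })
    return findings


# Hypothetical inventory; in practice this would come from the cloud provider's APIs.
INVENTORY = [
    {"id": "crm-db", "type": "rds", "region": "us-east-2", "multi_az": False},
    {"id": "crm-asg", "type": "asg", "region": "sa-east-1",
     "availability_zones": ["sa-east-1a", "sa-east-1b"]},
    {"id": "sg-bastion", "type": "security_group", "region": "us-east-2",
     "ingress": [{"port": 22, "cidr": "0.0.0.0/0"}]},
]

if __name__ == "__main__":
    for finding in scan(INVENTORY):
        print(f'{finding["pillar"]}: {finding["resource"]} ({finding["region"]}) '
              f'- {finding["description"]}')
```

Keeping each rule as data (an identifier, a pillar and a predicate) rather than as hard-coded logic means new best practices can be added without touching the scanning loop, which is one plausible reason a knowledge-base approach scales to hundreds of rules.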
The Coral UI shows all identified risks and provides a detailed description of each problem and the recommended steps for resolving it. The dashboard shows an overall Health score and a breakdown of risks by region, urgency, impact, and domain.
The pie charts below show the breakdown of risks to the various AWS Well-Architected Framework pillars detected by the solution.
It’s important to note that since the Coral trial was conducted on only a portion of Certsys’ nodes, it must be assumed that there are additional risks in the portion of the production and staging environments not scanned.