CloudZone Adopts Continuity Software’s Coral™ for AWS to Boost Reliability and Prevent Data Loss
CloudZone helps enterprises move to the cloud, providing end-to-end cloud management services supported by DevOps engineers. CloudZone is committed to ensuring its customers adopt the most advanced technologies to improve reliability, optimize cloud infrastructure performance, increase data security and cut cloud costs. A significant part of achieving these aims is deploying the right solutions to automate tasks so that cloud operations run more efficiently.
CloudZone is an AWS Premier Consulting Partner and has been working with AWS for a decade.
CloudZone knows the ins and outs of managing AWS infrastructure. The company’s IT environment is hosted on AWS, as are the environments of most of its managed services customers. It has deep and broad experience with AWS and adheres to the AWS Well-Architected Framework pillars. However, despite being a seasoned cloud managed service provider, CloudZone was concerned about core operational issues, especially performance degradation, disruptions and outages, and imperfect data protection that could lead to data loss. The company runs critical applications on AWS, including billing, CRM and others, and wanted to be sure they were always available with all data intact.
How cloud reliability is impacted
In dynamic cloud environments, misconfigurations arise from rapid changes to services and apps that frequently are not validated before going live. A related source of misconfiguration is the multiplicity of maintenance teams handling changes at the cloud provider – teams that don’t always have knowledge of changes made by others. All these conditions can often lead to performance disruptions, outages and even data-loss incidents.
CloudZone searches for an automated third-party solution
CloudZone wanted to overcome the obstacles to reliability and data protection. They knew this came down to ensuring their environment contained no misconfigurations that could lead to disruptions, particularly in performance and data protection, but also in areas such as security and compliance. They were searching for a proven, proactive solution that continually checks their AWS environment for misconfigurations, notifies them of the results, provides recommendations and facilitates automated self-healing. A side benefit of finding the right solution for their own needs would be gaining another solution they could confidently recommend to their customers.
The solution: Coral™ for AWS
CloudZone was referred to Continuity Software’s built-for-the-cloud Coral™ for AWS solution, used by some of the world’s leading enterprises to ensure resilience and reliability.
How Coral for AWS works
A SaaS solution, Coral addresses the reliability of AWS environments by continually scanning production workloads including AWS services such as virtual machines, containers, networks, load balancers, databases, storage, DNS, and more. The solution collects configuration data (metadata) from AWS via 20 (and growing) AWS native APIs using read-only privileges and employing secure and lightweight data collection. The scans do not and cannot change configurations and no agent is installed.
The scans work in conjunction with another key component, Continuity Software’s proprietary knowledgebase, which contains 300+ (and growing) rules covering the best practices needed to maintain reliability, protect data, and more. Configuration data collected by the scans is compared against the rules in the knowledgebase. Deviations from best practices, regulations (where relevant), SLAs, etc., become incident tickets to be repaired. Instructions for repair are provided, including automated self-healing functions that speed up response and decrease operational costs. This proactive process is critical to maintaining continuously available AWS environments: it prevents performance disruptions and outages, and it ensures data is not vulnerable to loss, damage or theft and is always recoverable.
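The comparison step described above can be illustrated with a minimal Python sketch. It is not Coral's implementation: the two rules, the resource field names (`type`, `versioning`, `availability_zones`) and the `Rule`, `Ticket` and `evaluate` names are all hypothetical, chosen only to show how collected configuration metadata might be checked against a rule set and how each deviation might become a ticket with a repair instruction.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A hypothetical best-practice rule (not taken from Coral's knowledgebase)."""
    rule_id: str
    resource_type: str                     # which kind of resource the rule applies to
    impact: str                            # e.g. "data loss" or "downtime"
    is_compliant: Callable[[dict], bool]   # True when the configuration follows the rule
    remediation: str


@dataclass(frozen=True)
class Ticket:
    """An incident ticket raised for one resource that deviates from one rule."""
    rule_id: str
    resource_id: str
    impact: str
    remediation: str


# Illustrative rules only; a real knowledgebase would hold hundreds.
RULES = [
    Rule(
        rule_id="S3-VERSIONING",
        resource_type="s3_bucket",
        impact="data loss",
        is_compliant=lambda cfg: cfg.get("versioning") is True,
        remediation="Enable versioning on the bucket.",
    ),
    Rule(
        rule_id="ASG-MULTI-AZ",
        resource_type="auto_scaling_group",
        impact="downtime",
        is_compliant=lambda cfg: len(set(cfg.get("availability_zones", []))) >= 2,
        remediation="Spread the group across at least two Availability Zones.",
    ),
]

# Assumed shape of metadata gathered by a read-only scan.
RESOURCES = [
    {"id": "bucket-a", "type": "s3_bucket", "versioning": False},
    {"id": "bucket-b", "type": "s3_bucket", "versioning": True},
    {"id": "asg-web", "type": "auto_scaling_group",
     "availability_zones": ["us-east-1a"]},
    {"id": "asg-api", "type": "auto_scaling_group",
     "availability_zones": ["us-east-1a", "us-east-1b"]},
]


def evaluate(resources: list[dict], rules: list[Rule]) -> list[Ticket]:
    """Compare each resource against every applicable rule; return one ticket per deviation."""
    tickets = []
    for resource in resources:
        for rule in rules:
            if rule.resource_type != resource["type"]:
                continue  # rule does not apply to this kind of resource
            if not rule.is_compliant(resource):
                tickets.append(
                    Ticket(rule.rule_id, resource["id"], rule.impact, rule.remediation)
                )
    return tickets


for ticket in evaluate(RESOURCES, RULES):
    print(f"[{ticket.impact}] {ticket.resource_id}: {ticket.remediation}")
```

The sketch only reads the metadata it is given and never modifies a resource, which mirrors the read-only nature of the scans. Keeping the rules as data separate from the evaluation loop is one way a rule set could grow without changes to the scanning logic.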
The solution’s UI shows all identified risks and provides a detailed description of each problem and the recommended steps for resolving it. The dashboard shows an overall Health score and a breakdown of risks by region, urgency, impact, business entity and domain. Below is a sample screen:
CloudZone tests Coral for AWS
In January 2020, CloudZone conducted a 14-day trial of the Coral for AWS solution. The scanned environment comprised four AWS accounts containing up to 500 nodes deployed in the US East (N. Virginia) region.
Business services (critical workload) included:
- Monitoring system
- Notification hub
- CRM application
The main AWS services in use were: EC2, S3, ELB, ASG, CloudFront, ECS, Redshift, SQS.
During the test period, 136 configuration risks were uncovered, the majority of which could lead to downtime or data loss, or could impact security. The breakdown of risks is shown below: