Whether your organization adopts a cloud-native approach to IT or runs its business on mainframes, configuration management remains critical to your IT operations. Configuration management is about controlling the state of all of the systems and applications in your environment. I call it the metadata of your IT operations. Data replication helps protect against data loss, but configuration management is just as essential.
Over the years, I’ve seen immature configuration management lead to failures ranging from failed DR invocations to unplanned production outages.
Beyond Data Replication
One of the biggest myths in DR readiness? That application data is the most critical part of a disaster recovery process. No doubt, application data is vital. But imagine a scenario in which you’ve replicated all of your ERP data to your redundant Tier 1 storage array or public cloud provider. However, you haven’t documented the application configuration in over a year, or even in a few months. Chances are there have been dozens of changes to the system over that period, enough that your application team must re-install the application in DR.
The impact on your recovery time objective (RTO) can’t be overstated. During regular operations, it can take the SAP Basis team weeks to install and test the base application before releasing it to business users. What’s the additional risk to the business if your organization relies on outside consultants to install your ERP platform? The resource constraints could result in days of additional downtime. And ERP is just one of many mission-critical applications and services protected by data replication.
One pushback I’ve heard is that you can replicate the entire machine state alongside the application data. In a perfect scenario, every application runs inside a virtual machine, or the configuration state resides in the application data. The configuration state is then replicated as part of the data replication.
Impact of Technical Debt
Most organizations deal with a combination of technical debt and modern application designs. Older architectures such as mainframes and legacy Unix don't provide an option for virtualization-based replication. There are also dependencies such as VPN connections, third-party service connections, SaaS, IaaS, and load balancer configurations. These are all systems that change over time.
When looking for tooling to get configuration management under control, think beyond the technology. Understand what you want to accomplish and design a high-level process that can help in narrowing down your tooling architecture. Here are some first steps to get you started.
Step one to getting your configuration management under control is to get a handle on what applications, infrastructure, and services make up your operations. You must know your inventory. Remember, think process over technology. Today, more than ever, the configuration items in your environment may not be a physical server or virtual machine. The configuration item may be a CloudFormation template.
Step two is to understand the dependencies between these subsystems. What’s the value of restoring an ERP application to a DR location if users can’t access the system due to a missed VPN connection between the DR site and your authentication service provider?
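To make the dependency question concrete, here is a minimal Python sketch, not a recommendation for any particular tool. The item names (erp-app, vpn-dr-to-auth, and so on) and the shape of the dependency map are hypothetical; in practice this data would come from your own inventory. The sketch reports which dependencies of a restored application are still missing in DR, and suggests an order in which to recover them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each configuration item lists what it depends on.
DEPENDENCIES = {
    "erp-app": {"erp-db", "auth-service"},
    "auth-service": {"vpn-dr-to-auth"},
    "erp-db": set(),
    "vpn-dr-to-auth": set(),
}


def missing_dependencies(item, recovered, deps=DEPENDENCIES):
    """Return every direct or transitive dependency of `item` not in `recovered`."""
    missing, seen, stack = set(), set(), [item]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, set()):
            if dep in seen:
                continue
            seen.add(dep)
            if dep not in recovered:
                missing.add(dep)
            stack.append(dep)  # keep walking to find transitive dependencies
    return missing


def recovery_order(deps=DEPENDENCIES):
    """Return items ordered so that each one comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    # The ERP database and auth service are restored, but the VPN was missed.
    print(missing_dependencies("erp-app", {"erp-db", "auth-service"}))
    print(recovery_order())
```

Even a map this simple surfaces the missed VPN connection before users discover it during a DR invocation.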
Step three is to implement a system to understand the state of these subsystems. An organization must document and track the configuration of these systems, and the changes to them, over time. In many cases, an outage or data loss may be the result of many small changes over a period rather than the last change made.
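As a rough illustration of tracking state over time, the Python sketch below fingerprints configuration snapshots and compares the last documented baseline with the current state. The configuration keys and values are invented for the example, and a real system would pull snapshots from the platforms themselves; treat this as a sketch of the idea rather than a finished design.

```python
import hashlib
import json

MISSING = "<missing>"  # marker for a key that is absent from one snapshot


def fingerprint(config: dict) -> str:
    """Hash a configuration snapshot so two states can be compared cheaply."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_config(old: dict, new: dict) -> dict:
    """Map each changed key to an (old value, new value) pair."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key, MISSING), new.get(key, MISSING)
        if before != after:
            changes[key] = (before, after)
    return changes


def accumulated_drift(snapshots: list) -> dict:
    """Compare the first (documented) snapshot with the most recent one."""
    return diff_config(snapshots[0], snapshots[-1])


if __name__ == "__main__":
    # Hypothetical snapshots of one system, oldest first.
    history = [
        {"kernel_param": "a", "patch_level": 1},
        {"kernel_param": "b", "patch_level": 1},
        {"kernel_param": "b", "patch_level": 2, "tls_enabled": True},
    ]
    if fingerprint(history[0]) != fingerprint(history[-1]):
        for key, (before, after) in accumulated_drift(history).items():
            print(f"{key}: {before} -> {after}")
```

Comparing against the documented baseline, rather than only the previous snapshot, is what exposes the accumulation of small changes.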
Last, make sure the conversation goes beyond just your development team or operations team. The discussion should span both application development and infrastructure. A failing I’ve seen in most organizations is a lack of coordination between application configuration management and infrastructure configuration management. Both disciplines are intertwined.