Aug

Beyond Virtualization: The problems with Data Centers

A few years ago this claim might have sounded far-fetched, but virtualization technology has become widespread and is now almost essential in many contexts.

I believe that systems automation is essential today, especially for the servers that provide cloud computing services. The number of machines in data centers keeps growing, so it is not enough to automate the creation and management of VMs; we must think about everything around them as well.

The problems associated with virtualization in the data center

In a virtual data center, the pace of operational change has increased. Virtual machines are reconfigured, computing workloads are moved, and applications grow and shrink rapidly. Continuous change increases the risk of errors: analysts estimate that 60 to 80 percent of data center problems are caused by mismanagement.

How can we ensure the stability of the data center while making the most of the flexibility offered by virtualization?

Virtualization promises to improve the operation of data centers, and no doubt it does. Server consolidation provides significant benefits. The ability to migrate workloads without downtime greatly simplifies hardware management. The ability to deploy new virtual machines far more quickly than physical machines makes application development and deployment faster and more effective.

The benefits of virtualization, however, come with costs. The hypervisor adds another layer of complexity to the software stack. It imposes requirements on the servers, the storage system and especially the network. While the hypervisor provides some automation to simplify server hosting operations, the environment around the virtual cluster has not become any easier to manage. In a recent survey conducted among customers, 70% said that virtualization puts additional pressure on network operations.

It is easy to understand the origin of this pressure. Every virtualization initiative is surrounded by physical resources and other systems:

Storage systems.
Users, workstations and networks of partners.
Load balancers and security devices.
Tools for remote administration of servers.
Physical servers.
Competing hypervisors that are not compatible with one another.
Private clouds, lab systems, and other specialized clusters.

The boundary between each of these elements and the virtual environment is where mistakes can happen during operations. Either side of the boundary can be the cause: the configuration of the hypervisor may be incorrect, or the environment outside it might be set incorrectly. When there is a performance issue, information from both sides of the boundary must be combined to find a solution. When new applications are deployed, both sides must be approved in advance. Errors and inconsistencies show up in three different ways: as application performance problems, as delays in operational procedures, and as activities that waste staff time. Each data center follows its own path, but here are some examples.

What are the main problems?

Application performance becomes poor or erratic:

Access port parameters and network parameters may not match. Many parameters affect performance, including the port duplex mode, network QoS settings, access lists, firewalls and more.

“Rogue devices” may be connected to the network with incorrect IP settings, or unsuitable devices may disrupt production.
Configurations “drift” away from best practices whenever manual procedures are followed incorrectly or standards are incomplete. As a result, new and older devices end up with very different settings, and performance becomes unpredictable.

Change requests take too long:

When you migrate a virtual server for upgrades or maintenance, its destination must have the correct network settings. Setting up ports manually causes delays, especially when compared with the almost instantaneous speed of a live virtual migration.

When a disaster recovery site is created, updated or tested, its network settings must be verified to match those of the primary site. Manual verification leads to delays.

When you add new servers to expand a load-balanced system, many devices, including the physical switch, firewall and load balancer, may require meticulous updates. Manual configuration adds delays and typically takes much longer than bringing up a new virtual server.
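The disaster recovery check described above lends itself to automation. The following is a minimal sketch, assuming each site's network settings have already been exported into a plain dictionary; the setting names (`vlan`, `mtu`, `qos_policy`, `duplex`) and the function `diff_settings` are hypothetical and chosen only for illustration.

```python
def diff_settings(primary, recovery):
    """Return {setting: (primary_value, recovery_value)} for every mismatch.

    A setting missing on one side is reported with None for that side.
    """
    all_keys = primary.keys() | recovery.keys()
    return {
        key: (primary.get(key), recovery.get(key))
        for key in sorted(all_keys)
        if primary.get(key) != recovery.get(key)
    }


# Hypothetical exported settings for the two sites.
primary_site = {"vlan": 120, "mtu": 9000, "qos_policy": "gold", "duplex": "full"}
recovery_site = {"vlan": 120, "mtu": 1500, "qos_policy": "gold"}

mismatches = diff_settings(primary_site, recovery_site)
for name, (want, have) in mismatches.items():
    print(f"{name}: primary={want!r} recovery={have!r}")
```

Run on a schedule or before a failover test, a comparison like this replaces the manual verification step with a report that lists only the settings needing attention.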

Staff waste time on routine tasks:

Daily activities such as the allocation of IP addresses must be coordinated. It can be difficult to identify errors in an environment of constant change.
Troubleshooting often involves correlating logs and alerts from multiple sources. With virtual machines, there is often a gap between the physical and virtual systems, where data needs to be matched manually.
If someone makes an unauthorized move or change, you waste time re-checking the work (or worse, fixing the errors).
Compliance reporting and verification is a nuisance in itself, and virtual systems add complexity.

In a virtual data center, changes are more complex and occur more often because of the flexibility of virtual machines. Errors become more expensive and can happen more frequently.
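As one example of turning a routine task into an automated check, the sketch below looks for IP addresses that have been allocated to more than one host. It assumes the allocations from the physical and virtual sides have been merged into a list of `(host, address)` pairs; the host names and the function `find_ip_conflicts` are hypothetical.

```python
from collections import defaultdict
import ipaddress


def find_ip_conflicts(allocations):
    """Return {address: [hosts]} for addresses claimed by more than one host."""
    owners = defaultdict(list)
    for host, address in allocations:
        # Parsing with ipaddress rejects malformed addresses (ValueError)
        # and makes equivalent spellings compare equal.
        owners[ipaddress.ip_address(address)].append(host)
    return {str(ip): hosts for ip, hosts in owners.items() if len(hosts) > 1}


# Hypothetical merged allocation records.
allocations = [
    ("web-vm-01", "10.0.0.11"),
    ("web-vm-02", "10.0.0.12"),
    ("db-host-01", "10.0.0.11"),
]

conflicts = find_ip_conflicts(allocations)
print(conflicts)
```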

But there is a way to master the complexity and minimize errors that does not require a complete reorganization of the infrastructure. It is enough to optimize the existing infrastructure with automation. If a configuration management platform is integrated into the data center network and can run automated procedures, the problems listed above can be addressed. An automated configuration platform can hold a “gold standard” for every item on the perimeter of the virtual system. Deviations from these standards, whether caused by rogue or misconfigured devices, can be prevented, repaired or isolated. The gold configurations can be applied in a single pass, resulting in a rapid and effective response to change requests. The troubleshooting process can be accelerated when the data from physical systems is correlated with the data from the virtual systems.
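The “gold standard” idea can be illustrated with a short sketch. It is not the interface of any particular product: it assumes device configurations are available as dictionaries, and the names `GOLD`, `find_deviations` and `remediate`, along with the example settings, are hypothetical.

```python
# Hypothetical gold standard for switch ports facing the virtual cluster.
GOLD = {"duplex": "full", "mtu": 9000, "qos_policy": "gold"}


def find_deviations(device_config, gold=GOLD):
    """Return {setting: (expected, actual)} for settings that differ from gold."""
    return {
        key: (expected, device_config.get(key))
        for key, expected in gold.items()
        if device_config.get(key) != expected
    }


def remediate(device_config, gold=GOLD):
    """Return a copy of the config with every gold setting applied in one pass."""
    return {**device_config, **gold}


# Hypothetical current device state.
devices = {
    "switch-a": {"duplex": "full", "mtu": 9000, "qos_policy": "gold"},
    "switch-b": {"duplex": "half", "mtu": 1500, "qos_policy": "gold"},
}

# Report only the devices that deviate, then compute the repaired configs.
report = {}
for name, config in devices.items():
    deviations = find_deviations(config)
    if deviations:
        report[name] = deviations

fixed = {name: remediate(config) for name, config in devices.items()}
print(report)
```

Keeping detection and remediation as separate steps lets an operator choose, per device, whether a deviation should be repaired automatically or only reported and isolated.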

Authorization and delegation rules can block unapproved changes and verify the approved ones.

Automation is needed in the network around the hypervisor to realize the full benefits of virtual systems. A network management and automation platform residing in the data center can minimize errors, promote flexibility, and cut the hidden costs of virtualization.