Capacity Management in the cloud depends on knowing the headroom for growth, the capacity to handle new business, and how efficiently the operational budget is leveraged to deliver services. And whereas most capacity managers' first focus is the utilization of critical compute assets, the needs of the cloud enterprise are more accurately served by understanding the unutilization rate, that is, the amount of capacity that is available to accommodate new business services or growing demands. While looking at utilization as a sheer percentage provides some insight, a fundamental consideration behind the unutilization rate is that capacity in a heterogeneous environment is not uniform. We can think of this new KPI in three ways:
1) The ‘smoothed peak’ utilization value (commonly expressed as a 95th or 98th percentile). This represents the sustained peak whilst disregarding the occasional spike that every component is bound to experience in normal operations.
2) The amount of capacity available, normalised into a common unit. For example, with network cards you may encounter 100 Mbps, 1 Gbps and 10 Gbps hardware; for memory, 32 GB, 64 GB and 128 GB. For CPU, you certainly need a modelling approach to account for the variety of chip architectures and scalability parameters, most likely normalised against a common benchmark or set of benchmarks, such as SPECint.
3) The maximum amount of capacity that can be used without performance degradation. For this, you need either a modelling approach or a wide set of engineering benchmarks that can be applied against all components in the estate. For example, a common benchmark for Ethernet is that performance degradation begins at around 40% utilization; for CPU, depending on the type of processor, you could use a maximum desired utilization threshold of 90%. The alternative is a modelling approach that can represent different workloads. This allows you to distinguish between a batch workload, where CPU utilization of 100% is often desired, and a micro-transactional workload running on Windows, where context-switch overhead can become significant above 75%.
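To make these three aspects concrete, here is a minimal Python sketch; every threshold, benchmark rating and sample value in it is an illustrative assumption rather than measured or vendor data:

```python
import numpy as np

# Illustrative max-desired-utilization thresholds per component type
# (assumed figures; derive your own from engineering benchmarks).
MAX_DESIRED_UTIL = {
    "ethernet": 0.40,    # degradation commonly begins around 40%
    "cpu_batch": 1.00,   # batch workloads can run flat out
    "cpu_oltp": 0.75,    # context-switch overhead grows past ~75%
}

def smoothed_peak(samples, percentile=95):
    """Smoothed peak utilization: the Nth percentile of raw samples,
    disregarding the occasional spike above it."""
    return float(np.percentile(samples, percentile))

def normalise_nic_mbps(speed_mbps):
    """Express network capacity in a common unit (Mbps)."""
    return float(speed_mbps)

def normalise_cpu(cores, benchmark_per_core):
    """Express CPU capacity in benchmark units (e.g. a SPECint-style
    per-core rating; the rating itself is an assumed input)."""
    return cores * benchmark_per_core

# Example: a 10 Gbps NIC sampled every 5 minutes over one day.
rng = np.random.default_rng(1)
samples = rng.uniform(0.10, 0.35, size=288)   # utilization as a fraction
peak = smoothed_peak(samples)                 # sustained peak, spikes removed
capacity = normalise_nic_mbps(10_000)         # 10 Gbps in common units (Mbps)
threshold = MAX_DESIRED_UTIL["ethernet"]      # max desired utilization
```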
Applying these three aspects of capacity to our unutilization rate, we can see that:
Unutilization rate = (maximum desired utilization − smoothed peak utilization) × capacity in common units
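For example, using illustrative figures: a 10 Gbps NIC with a smoothed peak utilization of 25% and a maximum desired utilization of 40% has an unutilization rate of (0.40 − 0.25) × 10,000 Mbps = 1,500 Mbps of safely usable headroom.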
In the cloud environment, two further effects should now be considered.
1) The temporal nature of workload. Workload balancing requires a macro view of the commodity capacity unutilization rate: a single figure that shows the available headroom. This means that macro, platform and resource-pool views are all required for a holistic approach to capacity management (see the sketch after this list).
2) The combination effects of varying workloads, which make individual workloads difficult to break out. In a cloud environment, capacity is made available quickly and dynamically so that business users can adjust to varying levels of demand. But from a provider’s perspective, as one workload waxes and another wanes, the underlying capacity consumed changes independently of any individual workload; it depends on the combination of workloads and their trends. For this reason, advanced forecasting should be performed through regression analysis of the combined workload, informed heavily by business growth forecasts.
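As a sketch of both effects (the pools, hosts, trend figures and the 75% ceiling below are all invented for illustration), the following fragment rolls per-host headroom up into resource-pool and macro views, then fits a simple linear regression to the combined utilization trend to estimate when the headroom runs out:

```python
import numpy as np

# Hypothetical resource pools: per-host unutilized capacity in common units.
pools = {
    "pool-a": {"host1": 1_500.0, "host2": 900.0},
    "pool-b": {"host3": 2_200.0, "host4": 400.0},
}

# Resource-pool and macro views of the available headroom.
pool_headroom = {name: sum(hosts.values()) for name, hosts in pools.items()}
macro_headroom = sum(pool_headroom.values())

# Combined-workload trend: twelve weekly smoothed-peak utilization figures
# for the whole estate (synthetic). We regress on the combination of
# workloads, not on any one workload in isolation.
weeks = np.arange(12)
rng = np.random.default_rng(7)
combined_util = 0.45 + 0.012 * weeks + rng.normal(0.0, 0.01, size=12)

# Ordinary least squares: utilization = slope * week + intercept.
slope, intercept = np.polyfit(weeks, combined_util, deg=1)

# Weeks until the trend crosses an assumed 75% utilization ceiling.
ceiling = 0.75
weeks_left = (ceiling - intercept) / slope if slope > 0 else float("inf")
print(f"macro headroom: {macro_headroom:.0f} units; "
      f"~{weeks_left:.0f} weeks to the {ceiling:.0%} ceiling")
```

A straight-line fit is the simplest possible model; in practice the regression input would be blended with the business growth forecasts described above.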
With these effects accounted for, predictive analytics can then be applied to anticipated organic growth, working in a highly consultative manner with the business owner. In a private cloud, the business owner is the CIO: the position accountable for running the cloud at a profit and for ensuring sufficient capacity for all customers.
Finally, our recommendation is that all cloud providers tightly correlate their financial management controls with their capacity management routines, in order to understand the cost of capacity provision and how effectively those investments are leveraged. This is where the unutilization rate adds vital business value. By viewing capacity in sheer financial terms, the unutilization rate provides both the ability to safely downsize capacity and reduce variable operational expense, and the ability to carefully and deliberately manage capacity to accommodate new and growing workloads as the cloud business expands.
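As a closing sketch (the unit costs are assumed placeholders, not market rates), correlating the unutilization rate with financial controls can start as simply as pricing the unused units:

```python
# Assumed cost per normalized capacity unit per month (placeholder figures).
COST_PER_UNIT_MONTH = {"network_mbps": 0.02, "cpu_benchmark_unit": 1.50}

def carrying_cost(unutilized_units: float, unit_cost: float) -> float:
    """Monthly cost of capacity that is provisioned but safely unused:
    either a downsizing candidate or deliberate headroom for growth."""
    return unutilized_units * unit_cost

# 1,500 Mbps of unused network headroom at $0.02 per Mbps per month.
print(f"${carrying_cost(1_500, COST_PER_UNIT_MONTH['network_mbps']):.2f}/month")
```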