Thursday 27 September 2012

Today, a MHz is just not a MHz any more...

A common trend in virtualization environments is to use the easily accessible MHz rating of a server as a normalization parameter - so that when you're running an optimization routine, you can express available capacity in MHz, compare it to some other capacity being consumed, and determine whether there's a fit - or not.  While this method of normalization makes complete sense given the data available, I'm here with some bad news.  They just don't make MHz like they used to.  Actually, in many cases, they make them better!  The SPECint2006_rate benchmark measures CPU throughput - which is exactly what a MHz rating is assumed to be a proxy for - so comparing the two tells us how good a proxy it really is.
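
To make that approach concrete, here is a minimal sketch of the kind of MHz-based fit check described above.  The host sizes, utilization figures and VM demand are hypothetical - this just illustrates the normalization the rest of this post picks apart.

```python
# Naive MHz-based normalization: treat capacity as cores x clock speed,
# and assume a MHz on one host equals a MHz on any other host.
# All figures are hypothetical, for illustration only.

def available_mhz(cores, clock_mhz, utilization):
    """Headroom on a host, expressed in 'normalized' MHz."""
    return cores * clock_mhz * (1.0 - utilization)

# A candidate VM currently consuming 4,000 MHz elsewhere.
vm_demand_mhz = 4000

# A target host: 8 cores at 2,500 MHz, running at 60% utilization.
headroom = available_mhz(cores=8, clock_mhz=2500, utilization=0.60)

print(f"Headroom: {headroom:.0f} MHz")            # 8000 MHz
print("Fits!" if headroom >= vm_demand_mhz else "No fit")
```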



Confused about capacity?  Take this example...  How much oil you can get through a pipeline is proportional to the cross-section of the pipeline - how fat it is - and the speed at which the oil moves through it.  Translate that into a digital context: the cross-section of a CPU corresponds to its number of cores, and its speed is the clock rate, measured in MHz.  The clock speed is the frequency of the chip, and it defines how quickly an individual task can be processed by the CPU.
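
In other words, the naive "pipeline" model of CPU capacity is simply width times speed.  A quick worked example, with made-up numbers:

```python
# Pipeline analogy: capacity = width (cores) x speed (clock).
# Two hypothetical servers that the MHz model rates as identical:
old_server = {"cores": 4, "clock_mhz": 3000}   # fewer, faster-clocked cores
new_server = {"cores": 6, "clock_mhz": 2000}   # more, slower-clocked cores

for name, s in (("old", old_server), ("new", new_server)):
    print(name, s["cores"] * s["clock_mhz"], "MHz of nominal capacity")

# Both print 12000 MHz - the model calls them equivalent,
# which is exactly the assumption the next section puts to the test.
```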




They don't make 'em like they used to...

The problem, though, is that the clever guys at Intel and the other processor manufacturers don't want to play this game.  They're always thinking of new ways to boost performance that don't rely on a MHz improvement alone.  Take a look at the data.  The chart below shows the ratio of SPECint2006_rate to GHz over the last 6 years, controlling for the number of cores in the benchmark measurement.  It shows that a GHz of 2011 silicon delivers the equivalent of roughly 1.15 GHz from just 12 months earlier.  Another interesting point is that the AMD data doesn't show the same rate of change - its trend line has a much lower gradient.  This highlights that the chipset is a hugely important factor when using MHz as a normalization rating.  A MHz just isn't a transferable unit.
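
To put a number on what that drift does to a MHz-based comparison, here is a rough worked example.  The 1.15-per-year factor comes from the trend in the chart; compounding it year on year is my own simplification for illustration.

```python
# Rough illustration of how a per-year throughput-per-GHz improvement
# erodes MHz as a common currency. The 1.15 factor is read off the
# trend above; compounding it across years is an assumption.
PER_YEAR_FACTOR = 1.15

def effective_ghz(nominal_ghz, years_newer):
    """Throughput of a newer chip, expressed in 'old' GHz."""
    return nominal_ghz * (PER_YEAR_FACTOR ** years_newer)

# A 2.0 GHz core bought this year vs. the same nominal clock a year ago:
print(effective_ghz(2.0, 1))   # ~2.3 'last-year' GHz from 2.0 nominal GHz
```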


Chart data: HP ProLiant (Intel) servers only, taken directly from SPEC.org


Conclusion

Normalization is a key part of good capacity management practice.  Using a percentage is simply a recipe for disaster when trying to apply intelligence to configuration optimization.  Using MHz is an easy option, but alas it is just fool's gold.  The data for Intel chips shows that the processing rate per MHz changes dramatically over time, which could introduce errors of over 15% per 12-month period into an optimization.  If you were moving from legacy kit that is 3 years old, the error margin may be over 75%.  This will always lead to over-specified machines - and a higher spend than necessary to meet business requirements.  Whilst that amount of headroom may have been justified in directly allocated capacity, in the cloud that overspend represents a high cost of ownership and immediate optimization challenges on deployment.
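
For contrast, a benchmark-based version of the same fit check might look like the sketch below.  The SPECint_rate-style figures are invented placeholders, not published results; the point is only that the common currency becomes measured throughput rather than nominal MHz.

```python
# Benchmark-based normalization sketch: capacity is expressed in
# throughput units (e.g. a SPECint_rate-style rating) instead of MHz.
# The ratings below are invented placeholders, not published SPEC results.

hosts = {
    "legacy_host": {"rating": 90,  "utilization": 0.50},
    "new_host":    {"rating": 240, "utilization": 0.60},
}

vm_demand_rating = 50   # the VM's footprint, in the same throughput units

for name, h in hosts.items():
    headroom = h["rating"] * (1.0 - h["utilization"])
    fits = "fits" if headroom >= vm_demand_rating else "does not fit"
    print(f"{name}: headroom {headroom:.0f} units -> VM {fits}")
```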

1 comment:

Debbie Sheetz said...

I did a similar analysis to support a paper I wrote for Computer Measurement Group in 2008, Predicting the Relative Performance of CPU. More recently I updated the analysis ...

If MHz were a good predictor of relative performance, all published SPEC ratings for a particular MHz would be similar to each other. But that is not what we found. Even within a single vendor, a single MHz rating doesn’t work out the way you would like.

When comparing SPECInt2006 ratings, all for Intel processors rated at 2000 MHz, there is a difference of 4:1 between the fastest and slowest, which could introduce serious inaccuracy if MHz is used as an indication of CPU capacity. For capacity planning purposes, you need to use something like SPECInt2006 to ensure you are accurately representing the performance characteristics of each system/processor you compare.
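
[A quick illustration of what a 4:1 spread at the same clock speed does to a MHz-based estimate - the ratings here are invented, only the ratio matters:]

```python
# Two hypothetical 2000 MHz Intel systems whose measured throughput
# differs 4:1, as described above (ratings invented for illustration).
slow_rating, fast_rating = 30, 120

# A MHz-based model rates them identically; a workload sized on the
# fast box would need 4x the expected capacity on the slow one.
print(fast_rating / slow_rating)   # 4.0
```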

This effect is intensifying as each generation of processors is introduced: the MHz have barely gone up over the last 5 years, but the ability to do work on a “per core” basis has increased dramatically. That’s the entire point of the 6, 8, 10, 12, or 16 cores/chip-type of architecture – you get more cores and you get faster cores, too.

The key is to have automated analysis tools that match your current and proposed servers with appropriate benchmarks such as SPECintRate2006 so you can use a meaningful yardstick for determining actual capacity, no matter what tool or technique you are using to make your projections.