In summer, I had an interesting conversation with a colleague of mine, Jörg Möllenkamp. We analyzed the performance numbers IBM had published for its 795 systems and were sure they were a bit too perfect. Jörg mentioned this in his blog.
So it is time for a quick review of IBM's scaling. They have published their „really-big-iron“ benchmark results in the last few weeks.
Let’s have a look:
- IBM 795, 32 CPU, 4 TB RAM: 688630 SAPS (Certificate 2010046)
- IBM 795, 16 CPU, 2 TB RAM: 384330 SAPS (Certificate 2010042)
- For more benchmark results, please follow the certificate links. I will focus on SAPS in this article; if you prefer SD users, please check the certificates.
Well, at first glance, both results are very impressive. I have never seen a customer who needs that much power „in one piece“. But maybe IBM can find one. Good luck!
But I'm also interested in scaling factors: how much more performance do I get if I double the number of CPUs in the overall system? This factor reflects more than raw CPU power: it also includes the overhead for the interconnect, for memory access, and so on.
Okay… IBM doubled the number of CPUs/cores and… the SAPS went up by a factor of 1.79. Is this a good scaling factor?
Well, let's compare this to other scaling factors. I picked SPARC64 on the M9000 with Oracle Database, and Intel's Nehalem-EX architecture with Oracle RAC. Oracle RAC implements so-called scale out, while large SMP systems like IBM's 795 or the M9000 scale up within the same box. In most cases, scaling out depends on a fast interconnect between the systems, while scaling up depends on a well-scaling OS and a fast interconnect between all CPUs of the system. I also included a scale-up comparison with Nehalem EX.
So, what are realistic scaling factors?
Nehalem EX / Oracle RAC (scale out):
- 8 CPUs: 115300 SAPS (Certificate 2010029)
- 16 CPUs: 221020 SAPS (Certificate 2010039)
- Scaling: 1.92
- Other Oracle RAC Benchmarks with older Intel CPUs show similar scaling factors around 1.9 (Certificates 2009041, 200940, 2009039)
Nehalem EX / MS SQL (scale up)
- 4 CPU: 57270 SAPS (Certificate 2010032)
- 8 CPU: 101720 SAPS (Certificate 2010040)
- Scaling: 1.78
SPARC64VII / Oracle 10g (scale up)
- 32 CPU: 95480 SAPS (Certificate 2009038)
- 64 CPU: 175600 SAPS (Certificate 2009046)
- Scaling: 1.84
Power7 / DB2
- 16 CPU: 384330 SAPS (Certificate 2010042)
- 32 CPU: 688630 SAPS (Certificate 2010046)
- Scaling: 1.79
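The scaling factors in the lists above are simply the SAPS of the larger configuration divided by the SAPS of the configuration with half as many CPUs. A minimal sketch of that arithmetic, using the SAPS values quoted above (the names `RESULTS`, `scaling_factor` and `scaling_factors` are only illustrative):

```python
# SAPS figures as quoted in the lists above:
# (label, SAPS of the smaller config, SAPS of the config with twice the CPUs)
RESULTS = [
    ("Nehalem EX / Oracle RAC (scale out)", 115300, 221020),
    ("Nehalem EX / MS SQL (scale up)", 57270, 101720),
    ("SPARC64VII / Oracle 10g (scale up)", 95480, 175600),
    ("Power7 / DB2", 384330, 688630),
]


def scaling_factor(saps_small, saps_large):
    """Ratio of SAPS after doubling the CPU count; 2.0 would be perfectly linear."""
    return saps_large / saps_small


def scaling_factors(results=RESULTS):
    """Return the scaling factor per configuration, rounded to two decimals."""
    return {
        label: round(scaling_factor(small, large), 2)
        for label, small, large in results
    }


if __name__ == "__main__":
    for label, factor in scaling_factors().items():
        print(f"{label}: {factor:.2f}")
```

A factor of 2.0 would mean perfectly linear scaling; everything below that is the overhead paid for the additional CPUs.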
In my opinion, this shows two important facts:
- Scale out and scale up are never perfect. You will always get some „penalty“ for having more than one simultaneous thread running.
- The mean value in my little, non-representative calculation is around 1.82 – 1.83. So being above this value suggests a good scaling factor.
There is also a slight indication that scaling out performed better on x86 than scaling up. I can't say whether the same holds on RISC, because I did not find any benchmark results that combine RISC and RAC.
At least these figures show one thing we expected: IBM, the perfect linear scaling you claimed in summer is … not reached yet. Maybe you should start building a better system for your world-class CPU.
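As a rough check of that mean, here is a small sketch that averages the four rounded scaling factors from the lists above and reports which configurations land above the average. The names `FACTORS` and `above_mean` are only illustrative, and with just four data points this is no more representative than the calculation itself.

```python
from statistics import mean

# Rounded scaling factors as listed above
FACTORS = {
    "Nehalem EX / Oracle RAC": 1.92,
    "Nehalem EX / MS SQL": 1.78,
    "SPARC64VII / Oracle 10g": 1.84,
    "Power7 / DB2": 1.79,
}


def above_mean(factors=FACTORS):
    """Return the mean scaling factor and the configurations that exceed it."""
    avg = mean(factors.values())
    better = [name for name, factor in factors.items() if factor > avg]
    return avg, better


if __name__ == "__main__":
    avg, better = above_mean()
    print(f"mean scaling factor: {avg:.2f}")
    print("above the mean:", ", ".join(better))
```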
Your conclusion is true, but you did not mention that scaling becomes more complicated as the number of CPUs and the speed of the individual CPU increase. So the SPARC scaling from 32 to 64 is very impressive, but since the per-CPU SAPS is approximately one fourth of a Nehalem's, it's easier to achieve. The technology for fast interconnects exists, and you don't use a slower technology just because the CPU is slower.
The better factor for scaling out compared to scaling up looks a bit as if something was added along with the second machine. This might be memory bandwidth or I/O bandwidth.