These general observations, which I drew from the "standing start" tests, are also borne out by the results of the "converged" runs. But a close look at the individual numbers will leave you wondering about consistency.
Some of this may be due to the randomness hidden in the cloud. While the companies make it seem like you're renting a real machine that sits in a box in some secret, undisclosed bunker, the reality is that you're probably getting assigned a thin slice of a box. You're sharing the machine, and that means the other users may or may not affect you. Or maybe it's the hypervisor that's behaving differently. It's hard to know. Your speed can change from minute to minute and from machine to machine, something that usually doesn't happen with the server boxes rolling off the assembly line.
So while there seem to be clear performance differences among the cloud machines, your results could vary. These patterns also emerged:
- Bigger, more expensive machines can be slower. You can pay more and get worse performance. The three Windows Azure machines started with one, two, and eight CPUs and cost 6, 12, and 48 cents per hour, but the more expensive they were, the slower they ran the Avrora test. The same pattern appeared with Google's one CPU and two CPU machines.
- Sometimes bigger pays off. The same Windows Azure machines that ran the Avrora jobs slower sped through the Eclipse benchmark. On the first runs, the eight-CPU machine was more than twice as fast as the one-CPU machine.
- Comparisons can be troublesome. The results table has some holes produced when a particular test failed, some of which are easy to explain. The Windows Azure machines didn't have the right codec for the Batik tests. It didn't come installed with the default version of Java. I probably could have fixed it with a bit of work, but the machines from Amazon and Google didn't need it. (Note: Because Azure balked at the Batik test, the comparative times and costs cited above omit the Batik results for Amazon and Google.)
- Other failures seemed odd. The Tradesoap routine would generate an exception occasionally. This was probably caused by some network failure deep in the OS layer. Or maybe it was something else. The same test would run successfully in different circumstances.
- Adding more CPUs often isn't worth the cost. While Windows Azure's eight-CPU machine was often dramatically faster than its one-CPU machine, it was rarely ever eight times faster — disappointing given that it costs eight times as much. This was even true on the tests that are able to recognize the multiple CPUs and set up multiple threads. In most of the tests the eight CPU machine was just two to four times faster. The one test that stood out was the Sunflow raytracing test, which was able to use all of the compute power given to it.
- The CPU numbers don't always tell the story. While the companies usually double the price when you get a machine with two CPUs and multiply by eight when you get eight CPUs, you can often save money if you don't increase the RAM too. But if you do, don't expect performance to still double. The Google two-CPU machine in these tests was a so-called "highcpu" machine with less RAM than the standard machine. It was often slower than the one-CPU machine. When it was faster, it was often only about 30 percent faster.
- Thread count can also be misleading. While the performance of the Windows Azure machines on the Sunflow benchmark track the number of threads, the same can't be said for the Amazon and Google machines. Amazon's two-CPU instance often went more than twice as fast as the one-CPU machine. On one test, it was almost three times faster. Google's two-CPU machine, on the other hand, went only 20 to 25 percent faster on Sunflow.
- The pricing table can be a good indicator of performance. Google's n1-highcpu-2 machine is about 30 percent more expensive than the n1-standard-1 machine even though it offers twice as much theoretical CPU power. Google probably used performance benchmarks to come up with the prices.
- Burst effects can distort behavior. Some of the cloud machines will speed up for short "bursts." This is sort of a free gift of the extra cycles lying around. If the cloud providers can offer you a temporary speed up, they often do. But beware that the gift will appear and disappear in odd ways. Thus, some of these results may be faster because the machine was bursting.
- The bursting behavior varies. On the Amazon and Google machines, the Eclipse benchmark would speed up by a factor of more than three when using the "converge" option of the benchmark. Windows Azure's eight-CPU machine, on the other hand, wouldn't even double.