Amazon bests Microsoft, all other contenders in cloud storage test

Amazon's S3 Simple Storage Service has outperformed Microsoft's Windows Azure Storage and all other major providers in an extensive study testing the feasibility of businesses using cloud services for primary storage, data protection, and disaster recovery.

Nasuni, which sells data protection services that work across any type of cloud storage, says it has been testing the 16 largest cloud storage providers (CSPs) since April 2009 to determine the best services for its customers. Ultimately, only six of the 16 providers passed Nasuni's testing—in addition to Amazon and Microsoft, the other winners were Nirvanix, Rackspace, AT&T Synaptic, and Peer1 Hosting. Both AT&T and Peer1 use EMC's Atmos platform on the back end, although EMC itself discontinued its own public cloud based on Atmos.

While these six are, apparently, ready for real-world use, Nasuni politely declined to say which ten services failed its test, so we can't warn you away from those vendors. But Nasuni does say the difference between the ones who passed the tests and those that didn't is in some cases quite large. When Nasuni tested the providers for scalability by continuously writing small files of 1KB for weeks on end to determine error rates and performance, two of the eight providers that made it to this stage of testing failed, and others couldn't complete the test.

"Without proper testing, it is impossible to differentiate between an industrial-strength CSP and a lesser operation," Nasuni said. "In fact, some providers have asked Nasuni to cease testing at this stage because they said it was negatively impacting their customers, which is a truly frightening statement. True cloud storage should be able to accommodate billions of files without any visible strain. Those CSPs that faced performance issues under Nasuni’s test are simply not equipped to deliver an appropriate level of service to customers."

As in most cases, Amazon S3 and Microsoft's Azure were the top two performers in the scalability test, with Amazon's error rate at "effectively zero" and Microsoft close to it.

The tests were divided into five categories. Nasuni began by looking at API integration, "to ensure that it is possible to test the service at all." More categories included unit testing to determine whether each CSP can handle basic functions like writing and reading different file sizes; performance testing to measure response time across multiple simultaneous threads and a range of object sizes and workload types; stability testing to determine long-term reliability by continuously writing and reading files to ensure all data is preserved; and the scalability testing we mentioned earlier in this article.

Nasuni tried to make the tests fair, for example by testing stability from numerous locations. In the case of stability testing, providers "had to perform with no data loss and have no significant unplanned outages" in order to pass.

"Two CSPs emerged as top performers in the Nasuni study: Amazon S3 and Microsoft Windows Azure, with Amazon S3 being the standout across all evaluation areas," Nasuni said. Other vendors performed well in specific areas. "Though Nirvanix was 17 percent faster than Amazon S3 for reading large files, and Microsoft Azure was 12 percent faster when it comes to writing files, no other vendor posted the kind of consistently fast service across all file types as did Amazon S3."

"Amazon S3 had the fewest outages and best uptime, and was the only CSP to post a 0.0 percent error rate in both writing and reading objects during scalability testing," Nasuni continued. "And though Microsoft Azure had a slightly faster average ping time than Amazon S3 (likely because Amazon S3 is much more heavily used than Microsoft Azure), Amazon nevertheless had the lowest variability."

Amazon customers suffered a severe outage during April in the separate Elastic Block Store service, which provides mountable disk volumes to virtual machines hosted in Amazon's Elastic Compute Cloud. However, the S3 storage service tested by Nasuni posted excellent uptime. Amazon had the fewest outages at 1.4 per month, but they were so small that S3's availability was essentially 100 percent, according to Nasuni.

Azure had 11.1 outages per month, with availability at 99.9 percent, while Rackspace and Peer1 achieved almost identical numbers. Nirvanix had 332 outages per month, but the outages must have been small ones as Nirvanix's overall availability was still 99.8 percent. AT&T, with 10.4 outages per month, posted the worst uptime because of their duration. AT&T's availability was 99.5 percent.
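The relationship between outage count and availability can be checked with simple arithmetic. The following is a minimal sketch, not Nasuni's method: it assumes a 30-day month and treats the rounded percentages quoted above as exact, so the results are only rough. The names `REPORTED`, `monthly_downtime_minutes`, and `mean_outage_minutes` are illustrative.

```python
# Assumption: a 30-day month, i.e. 43,200 minutes.
MINUTES_PER_MONTH = 30 * 24 * 60

# (outages per month, availability in percent) as quoted from the Nasuni report.
REPORTED = {
    "Azure": (11.1, 99.9),
    "Nirvanix": (332, 99.8),
    "AT&T": (10.4, 99.5),
}


def monthly_downtime_minutes(availability_pct):
    """Total minutes of downtime per month implied by an availability figure."""
    return (1 - availability_pct / 100) * MINUTES_PER_MONTH


def mean_outage_minutes(outages_per_month, availability_pct):
    """Average length of one outage, assuming downtime is spread over all outages."""
    return monthly_downtime_minutes(availability_pct) / outages_per_month


for name, (outages, availability) in REPORTED.items():
    minutes = mean_outage_minutes(outages, availability)
    print(f"{name}: about {minutes:.2f} minutes per outage")
```

Under these assumptions, Nirvanix's 332 outages average well under a minute each, while AT&T's 10.4 outages average roughly 20 minutes each, which is consistent with AT&T posting the worst uptime despite having far fewer outages.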

The full Nasuni benchmark report is here if you want to dive into it a bit more. Coincidentally, Microsoft today released an update to Azure that may be interesting to developers, which is detailed on the Azure blog. We also noted a couple months ago that in a more limited test conducted by the vendor Compuware, Azure came out ahead of Amazon in terms of speed.

Because cloud services are publicly available, they provide a good opportunity for organizations interested in benchmarking. The Nasuni test does appear to be among the most comprehensive on the subject, Enterprise Strategy Group founder and senior analyst Steve Duplessie tells Ars. "We happen to know that Nasuni has been collecting a lot of data on all these guys for a long time, so we're confident with the results being accurate," Duplessie says. "The cloud is a fuzzy thing, no pun intended. It's good that someone has been tracking reality."

In conclusion, the Nasuni report states, "It is not difficult to create something that looks like cloud storage. It is very difficult, however, to create a cloud that is truly scalable, reliable and always available."

User comments

After seeing Reddit go up and down, up and down, repeatedly, I'm not so sure I'm terribly impressed by Amazon. But I guess the best in a market that is not very reliable is still a "winner" of sorts.
Useless analysis, as expected. In what universe is 1MB a "large file"? 1MB is the write size where distributed filesystems just begin to hit their stride, so you would need to use something like 1MB writes to a 1GB file to get any meaningful result. As for the performance numbers, they are meaningless. You can write 1MB files to S3 at 2MB per second? Is that on a single stream, or aggregated across a sharded client? Was the response time normally distributed? What was the 99th percentile write time?

Their test creates 100 million files, and they claim (without any supporting evidence whatsoever) that a cloud should support billions of files without a problem. But they don't test metadata operations. How long would it take to traverse a tree of a billion files on S3 or Azure? Is it even possible? Can you walk the tree in parallel, and if you do, does it take less time, or more?

I don't even want to comment on the rank stupidity of "average outages per month." What the hell does that mean? Would it be 1 if the service was offline all month? When the service is out, is it a total outage, or one shard? Do you lose access to the metadata or the data itself, or both?
wwif wrote:
After seeing Reddit go up and down, up and down, repeatedly, I'm not so sure I'm terribly impressed by Amazon. But I guess the best in a market that is not very reliable is still a "winner" of sorts.

Reddit's downtime is rarely actually Amazon's fault; they just don't have enough actual capacity.
I think Reddit going up and down has more to do with Reddit than it does with Amazon. With the exception of the catastrophic failure in some of their availability zones in North Virginia, I've scaled out a few large scale applications on the Amazon cloud and I haven't had any trouble.
Plus, early on, Reddit didn't set up their cloud properly. Seems like they learned their lessons.
Great, something those snobs across the pond can't feel superior about!

:D
S3 needs to start supporting CORS.

That is all.
jwbaker wrote:
Useless analysis, as expected. In what universe is 1MB a "large file"? 1MB is the write size where distributed filesystems just begin to hit their stride, so you would need to use something like 1MB writes to a 1GB file to get any meaningful result. As for the performance numbers, they are meaningless. You can write 1MB files to S3 at 2MB per second? Is that on a single stream, or aggregated across a sharded client? Was the response time normally distributed? What was the 99th percentile write time?


For more details on the reasoning and methodology, see: http://www.nasuni.com/blog/15-testing_t ... part_1-api - that links to the other posts that dive into all of the details of how, and why.
I don't see anything in that multi-part blog post that answered any of my questions, for example long-tail response times and ability to traverse billions of files.
Unfortunately, the study doesn't seem to have collected (or at least not published) what is probably the most important metric for cloud storage: p99 (or p95, or p99.9) latency for reads and writes. The biggest issue using cloud storage for any sort of DB (whether MySQL or NoSQL of some sort) tends to be stability of latency for I/O operations. A fast best-case or even average doesn't mean much if every now and then it suddenly starts taking 30ms for every block read or write.
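To illustrate the point about tail latency, here is a minimal sketch of how a mean can hide slow operations that a p99 figure exposes. The latency samples are made up for illustration (they are not from the Nasuni study), and `percentile` is a hypothetical helper using the nearest-rank definition.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical I/O latencies in milliseconds: 98 fast operations and 2 slow ones.
latencies_ms = [2.0] * 98 + [30.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50_ms:.1f} ms  p99={p99_ms:.1f} ms")
```

With these made-up samples the mean and median stay close to the fast case, while the 99th percentile lands on the 30ms stalls, which is the behavior a database workload would actually feel.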
wwif wrote:
After seeing Reddit go up and down, up and down, repeatedly, I'm not so sure I'm terribly impressed by Amazon. But I guess the best in a market that is not very reliable is still a "winner" of sorts.


Reddit's problems also weren't with the S3 service, which is what this article is talking about. EC2 is Amazon's cloud infrastructure offering (as opposed to cloud storage) and that's where Reddit was having issues (though I agree with others here that Reddit set things up kind of wrong).
ScottTFrazer wrote:
wwif wrote:
After seeing Reddit go up and down, up and down, repeatedly, I'm not so sure I'm terribly impressed by Amazon. But I guess the best in a market that is not very reliable is still a "winner" of sorts.

Reddit's problems also weren't with the S3 service, which is what this article is talking about. EC2 is Amazon's cloud infrastructure offering (as opposed to cloud storage) and that's where Reddit was having issues (though I agree with others here that Reddit set things up kind of wrong).

Oh, right. Thanks for the correction.
As always, we at Nasuni appreciate your commentary. To address some of the questions you've raised, our President, Rob Mason, and CEO, Andres Rodriguez, put together a short video response. You can find it at the following page: http://www.nasuni.com/how_it_works/reso ... report_q_a

-Louis
I took a look at the link but it was video. Is there a text version around somewhere?
Devin wrote:
I took a look at the link but it was video. Is there a text version around somewhere?


Hi Devin,

We are working on a transcription. I will send you a note once it is uploaded.

Thanks,
Louis
Awesome, thanks. I'm too old and curmudgeonly to watch online video.
Hi Devin,

You can find a transcript on the following page: http://www.nasuni.com/how_it_works/reso ... report_q_a

Thanks!

Louis