Commodity hardware aiding data warehouse appliance performance, costs

Jeff Kelly, News Editor

Data warehouse appliances -- prepackaged bundles of analytic software with server and storage hardware -- are growing more powerful by the day. Netezza's latest appliance, TwinFin, for example, can scale to nearly a petabyte of data, according to the company.

But instead of charging higher and higher prices as analytic power increases, data warehouse appliance vendors are actually lowering their prices. TwinFin goes for just $20,000 per terabyte, which one industry analyst described as the "lowest price point on the market."

So how are data warehouse appliance vendors able to improve performance and lower costs at the same time? They are increasingly switching from expensive proprietary hardware to low-cost commodity server and storage technology from the likes of IBM, HP and Sun, according to analysts.

With TwinFin, Netezza has partnered with IBM to run the appliance on IBM's blade servers. Previous Netezza appliances ran on the vendor's own proprietary hardware, which customers had to pay for, increasing the overall price tag, said Arindam Banerjee, an analyst with Boston-based Yankee Group.

"If you have a lot of blade servers, which normally a lot of people have because it's low cost and it's very good from a processing power perspective, you can just run the application on top of that," Banerjee said. "So decoupling that is very important."

By leveraging lower-cost hardware, TwinFin is also cheaper to scale out to large data sets, according to Phil Francisco, vice president of product management and marketing at Netezza. Netezza's 10,000 Series, for example, scaled to only about 300 terabytes, less than a third of what TwinFin can scale to, he said.

Scalability is particularly important for NYSE Euronext, a private firm based in New York that runs the New York Stock Exchange, the London International Financial Futures Exchange and several other exchange markets. The company has been a Netezza customer since 2007, using the vendor's 10,000 Series to identify bottlenecks in the computer network that processes its financial transactions. NYSE Euronext has been testing the TwinFin appliance for several months and plans to migrate to it soon.

"We maxed out their most current product offering," said Steve Hirsch, chief data officer and senior vice president for global data services at NYSE Euronext, referring to the 10,000 Series. Hirsch said NYSE Euronext stores about a petabyte of data, running between one and one-and-a-half terabytes of that data through the Netezza appliance daily. One query, he said, could examine up to 12 months of data.

Historically, data volumes at NYSE Euronext have increased by 60% a year in its equities business and 100% a year in its derivatives business, Hirsch said, so being able to grow the data warehouse appliance over time is key. TwinFin running on commodity IBM blade servers will make that job a lot easier, he said.
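To put those growth rates in perspective, the short Python sketch below compounds them over a few years. The 60% and 100% figures come from Hirsch; the function names and the threefold target (roughly the gap between the 10,000 Series' 300 terabytes and TwinFin's near-petabyte ceiling) are illustrative assumptions, not NYSE Euronext's own planning model.

```python
# Illustrative sketch only: compounds the year-over-year growth rates quoted above.
# Function names and the 3x target multiple are assumptions made for this example.

def growth_multiple(annual_rate: float, years: int) -> float:
    """Return how many times larger a data set is after `years` of compounding."""
    return (1.0 + annual_rate) ** years


def years_to_exceed(annual_rate: float, target_multiple: float) -> int:
    """Return the first whole year in which volume reaches `target_multiple` times today's."""
    years = 0
    volume = 1.0  # today's volume, normalized to 1
    while volume < target_multiple:
        volume *= 1.0 + annual_rate
        years += 1
    return years


EQUITIES_RATE = 0.60      # 60% year-over-year, per Hirsch
DERIVATIVES_RATE = 1.00   # 100% year-over-year, per Hirsch
TARGET_MULTIPLE = 3.0     # assumed: about 300 terabytes growing to about a petabyte

for label, rate in (("equities", EQUITIES_RATE), ("derivatives", DERIVATIVES_RATE)):
    multiple = growth_multiple(rate, 3)
    years = years_to_exceed(rate, TARGET_MULTIPLE)
    print(f"{label}: {multiple:.2f}x after 3 years; passes {TARGET_MULTIPLE:.0f}x in year {years}")
```

At those rates, a data set that fits comfortably today would need roughly three times the capacity within two to three years, which is why headroom matters as much as the current price per terabyte.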

Netezza is not the only data warehouse appliance vendor to embrace commodity hardware to power its applications, according to James Kobielus, an analyst with Cambridge, Mass.-based Forrester Research. He said Greenplum's analytic database can run on multiple hardware platforms, as can Aster Data Systems' nCluster software and Oracle's Optimized Warehouses, allowing the vendors to keep their prices low.

"Netezza has basically taken their platform and opened it up, so now they can compete on price because they're able to source low-cost commodity components for storage and the like in their architecture," Kobielus said. He estimated Netezza's previous data warehouse appliance offerings, using proprietary hardware, could run as high as $200,000 per terabyte.

"$20,000 [per terabyte] is where you need to be to compete in this market. I suspect in a couple years' time you'll have to be under $15,000, $10,000 per terabyte," Kobielus said. "They've essentially taken a page out of DATAllegro's book."

Since its founding in 2003, DATAllegro has used commodity hardware from HP, Dell, EMC and others to support its data warehouse appliance. The vendor was acquired by Microsoft in 2008.

Both Kobielus and Yankee Group's Banerjee said they expect other data warehouse appliance vendors to follow Netezza's lead and adopt commodity hardware, if for no other reason than to remain competitive on pricing.

"This will be interesting to see how [Netezza's] competition will respond to this," Banerjee said. "These guys have to do it. It's just the way it is," he added, referring to moving to a commodity hardware-based infrastructure.

Indeed, Microsoft's upcoming Project Madison release, a data warehouse appliance offering based partly on acquired DATAllegro technology, will almost certainly rely on DATAllegro's commodity hardware partnerships to keep its price low, Kobielus said.

"By the end of this year, Microsoft will be getting extremely close to putting its SQL Server/Madison appliances out in the field," he said. "From everything I gather from Microsoft, they're going to make that highly cost effective."

Teradata, whose data warehouse appliance with proprietary hardware is considered among the best in the market (but also among the most expensive), may also be forced to go the commodity hardware route, he said.

"This is the chief trend in this market," Kobielus said: "maximum scalability and maximum affordability in the same appliance-based platform."
