EMC unveils Hadoop for big data

By Patricia Pieterse, iWeek assistant editor
Johannesburg, 10 May 2011

EMC's data computing division, Greenplum, has announced an enterprise distribution of the Apache Hadoop open source platform.

“Hadoop is certainly a phenomenon,” says Luke Lonergan, co-founder of Greenplum. “We've seen a lot of enthusiastic discussions about how Hadoop has changed operations at companies like Facebook.”

Hadoop is a software framework for data-intensive distributed applications, suited to storing and analysing very large volumes of data.

“The key aspects of what we are bringing to Hadoop is around innovation that makes it capable of handling enterprise workloads in a very mission-critical environment,” says Lonergan. The Greenplum Hadoop distribution combines Hadoop with Greenplum's database.

EMC's Hadoop distribution family comprises the Greenplum HD Community Edition, the Enterprise Edition and the Data Computing Appliance.

“We're taking pretty typical open source approaches to productise Hadoop,” says Scott Yara, co-founder of Greenplum. “So what we decided is we're going to build a Hadoop distribution, and package that as a traditional open source project. EMC is taking a very proactive open source friendly stance, and saying our key innovations are going to be provided back to the Apache foundation,” he says.
