EMC Delivers Unified Big Data Analytics Appliance
Continuing to build momentum for its “Big Data” analytics solutions, EMC Corporation (NYSE: EMC) introduced the EMC® Greenplum® Modular Data Computing Appliance (DCA). The Greenplum DCA features a modular architecture that for the first time allows enterprises to combine a shared-nothing MPP relational database with enterprise-class Apache Hadoop – along with Greenplum partner BI and ELT applications – to achieve true structured and unstructured data co-processing, and expand it gracefully as needed in a single, unified platform.
DCA modules enable enterprises to completely change how they think about scaling by providing the ability to start small and expand the appliance in more flexible and cost-effective quarter-rack increments based on processing performance or storage capacity needs. In addition to mixing and matching Greenplum Database and Greenplum HD (Hadoop) modules, enterprises can also bring all their BI applications and ELT tools directly into the cluster, in the same DCA, through the use of new Greenplum Data Integration Accelerator modules. The result is a unified Big Data platform combining structured and unstructured data and applications in a single infrastructure that is also monitored, managed and supported by EMC.
Today, enterprises are seeking to make better use of their data warehouses for advanced analytics, and this trend will accelerate as organizations strive to move from running business intelligence point solutions to deploying comprehensive analytics enterprise-wide. At the same time, enterprises are placing greater importance on integrating and analyzing their accumulated unstructured and semi-structured data. But as data warehouses get bigger, enterprises are facing scalability limits, performance degradation and management complexity, and are seeking ways to enable more concurrent users to access the data for business applications.
Four Greenplum Data Computing Appliance modules are available today:
- The Greenplum Database Module is a purpose-built, highly scalable data-warehousing appliance module that architecturally integrates database, computing, storage and network into an enterprise-class, easy-to-implement system. It is the industry leader in price and performance.
- The Greenplum Database High Capacity Module is designed to host multiple petabytes of data without a surge in power consumption, cost or floor space. For businesses that require detailed analysis of extremely large amounts of data or those looking for a longer-term archive, this high-capacity version offers the lowest cost-per-unit data warehouse.
- The Greenplum HD Module is the world’s first high-performance data co-processing Hadoop appliance module. It marries Hadoop with the Greenplum Database, allowing true co-processing of both structured and unstructured data within a single, seamless solution.
- The Greenplum Data Integration Accelerator (DIA) Module hosts partner analytics applications and places them directly on the same high-performance, low-latency interconnect as the other appliance modules. This enables market-leading data loading performance in a parallel and scalable model, which can shorten batch loads or support micro-batch loading.
“Greenplum’s unique approach to Big Data analytics is resonating with a growing number of large enterprise customers in the financial, healthcare, and media industries who are planning for the future and have adopted our technology to make fundamental changes in their overall data infrastructure in order to tackle these Big Data issues. Today’s announcement reflects our shift to a focus on building the analytics platform of the future, and we have had great success with that as enterprises adopt and validate our vision of data co-processing.” – Bill Cook, president and general manager of Greenplum, a division of EMC
Enterprises can start with a single primary rack, which includes one standard or high-capacity Greenplum Database quarter-rack module, room for three additional modules, and two master servers. The master servers are responsible for authentication, query optimization, balancing the workload among the segment servers, managing the fault-tolerance mechanism for data and other tasks for the cluster. As their demand for processing capacity grows, enterprises may expand the appliance in quarter-rack increments using Greenplum Database, Greenplum HD or Greenplum DIA modules in any order and amount, up to six racks total. All modules are linked via a redundant, high-performance, low-latency interconnect.
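The expansion rules described above can be sketched as a small capacity model. This is an illustrative sketch only, not EMC tooling: the class, method and module-kind names are hypothetical, and it assumes that every rack (not just the primary one) holds four quarter-rack modules, which the announcement implies but does not state outright.

```python
from dataclasses import dataclass, field

MODULES_PER_RACK = 4  # quarter-rack increments, per the announcement
MAX_RACKS = 6         # "up to six racks total"

# Hypothetical labels for the four module types named in the announcement.
DATABASE_KINDS = {"database", "database_high_capacity"}
ALL_KINDS = DATABASE_KINDS | {"hd", "dia"}


@dataclass
class Appliance:
    """Toy model of a DCA as an ordered list of quarter-rack modules."""
    modules: list = field(default_factory=list)

    def add_module(self, kind: str) -> None:
        if kind not in ALL_KINDS:
            raise ValueError(f"unknown module kind: {kind}")
        # The primary rack starts with a Greenplum Database module.
        if not self.modules and kind not in DATABASE_KINDS:
            raise ValueError("first module must be a Greenplum Database module")
        if len(self.modules) >= MODULES_PER_RACK * MAX_RACKS:
            raise ValueError("appliance is already at six full racks")
        self.modules.append(kind)

    @property
    def racks_used(self) -> int:
        # Ceiling division: a partly filled rack still counts as a rack.
        return -(-len(self.modules) // MODULES_PER_RACK)


# Start small, then grow in any order and amount.
dca = Appliance()
for kind in ["database", "hd", "dia", "database_high_capacity", "hd"]:
    dca.add_module(kind)
```

Under these assumptions the upper bound is 24 modules (four per rack across six racks), and the five modules added in the usage lines spill into a second rack.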
This new release of the Greenplum DCA also includes increased high availability and simple integration with EMC’s market-leading solutions for data protection and disaster recovery. High availability is addressed via automated master-node failover and the Greenplum Database High Availability Group, enabling each full-rack DCA to sustain up to four server failures, one in each HA Group, nearly doubling the high availability rate. The DCA also integrates with industry-leading EMC Data Domain® deduplication and backup technology for high-speed backup and restore and wide-area disaster recovery. The Greenplum DCA SAN Mirror Solution uses EMC Symmetrix® VMAX™, TimeFinder®/Snap and Symmetrix Remote Data Facility (SRDF®) for advanced storage and data replication between two sites in synchronous mode.