Pivotal HAWQ and Hadoop - dummies

By Dirk deRoos

In 2010, EMC and VMware, market leaders in delivering IT as a service via cloud computing, acquired Greenplum Corporation, the company that had successfully brought the Greenplum MPP Data Warehouse (DW) product to market. Later, in 2012, Pivotal Labs, a leading provider of Agile software development services, was also acquired.

Through this federation of companies, the Pivotal HD Enterprise platform was announced in early 2013. This platform, which is integrated with Apache Hadoop, includes the Pivotal HAWQ (Hadoop With Query) product — the former Greenplum MPP DW product.

Though the Pivotal HD Enterprise platform also includes other components and technologies (VMware’s GemFire, for example), the focus here is on the Pivotal HAWQ product, Pivotal’s approach to low-latency interactive SQL queries on Hadoop.

Pivotal has integrated the Greenplum shared-nothing MPP (massively parallel processing) DW with Apache Hadoop to enable big data analytics. The Pivotal HAWQ MPP DW stores its data in the Hadoop Distributed File System (HDFS).
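To make the shared-nothing MPP idea concrete, here is a toy sketch in Python. It is not HAWQ code and does not reflect HAWQ's actual internals; the function names (`segment_for`, `distribute`, `local_sum`, `mpp_sum`), the sample `orders` data, and the use of a CRC32 hash are all assumptions made for illustration. Each row is assigned to exactly one "segment" by hashing a distribution key, each segment aggregates only its own rows, and a coordinating step merges the partial results.

```python
import zlib
from collections import Counter


def segment_for(value, num_segments):
    """Pick a segment for a distribution-key value using a stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % num_segments


def distribute(rows, key, num_segments):
    """Place each row on exactly one segment; segments share no data."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


def local_sum(segment_rows, group_col, value_col):
    """Work done independently on one segment: a partial SUM ... GROUP BY."""
    partial = Counter()
    for row in segment_rows:
        partial[row[group_col]] += row[value_col]
    return partial


def mpp_sum(rows, key, group_col, value_col, num_segments=4):
    """Coordinator step: gather the per-segment partial sums and merge them."""
    segments = distribute(rows, key, num_segments)
    partials = [local_sum(seg, group_col, value_col) for seg in segments]
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return dict(total)


# Hypothetical sample data, roughly:
#   SELECT region, SUM(amount) FROM orders GROUP BY region;
orders = [
    {"order_id": 1, "region": "east", "amount": 10},
    {"order_id": 2, "region": "west", "amount": 5},
    {"order_id": 3, "region": "east", "amount": 30},
    {"order_id": 4, "region": "west", "amount": 20},
]

if __name__ == "__main__":
    print(mpp_sum(orders, "order_id", "region", "amount", num_segments=3))
```

In a real MPP system the segments would run on separate machines with their own storage, and the merge step would move partial results over the network; the sketch only shows why the final answer does not depend on how many segments the rows are spread across.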

Pivotal HAWQ provides ANSI SQL support and enables SQL queries of HBase tables. HAWQ also includes its own set of catalog services instead of using the Hive metastore. The Pivotal HAWQ approach is to provide a highly optimized and fast Hadoop SQL query mechanism on top of Apache Hadoop.