Data Warehouse Appliances

By Thomas C. Hammergren

Like bell-bottom jeans, hardware-assisted databases are on the comeback trail. Offerings from Microsoft, Oracle, and Netezza are all the rage at database seminars around the globe. In the mid- to late 1980s, vendors Britton Lee and Teradata (which eventually merged) enjoyed the same kind of attention.

They provided dedicated machines that optimized database processing — the first machines used by heavy data consumers, including many of the consumer-packaged goods companies.

The objective of these boxes was to dedicate all aspects of a computer to getting data to the users faster. Each dedicated machine combined a query-centered database with its own memory, CPU, and disk operations. Eventually, such products fell out of vogue, and database management systems migrated to a more open, run-on-any-box architecture.

Now, they’re back!

A data warehouse appliance is an integrated set of servers, storage, operating system, DBMS, and software, all pre-installed and pre-optimized specifically for data warehousing. Data warehouse appliances provide solutions for the mid- to large-volume data warehouse market, offering low-cost performance on data volumes in the terabyte to petabyte range (that’s a lot of data!).

Most data warehouse appliance vendors use massively parallel processing (MPP) architectures to provide high query performance and platform scalability. MPP architectures consist of independent processors or servers executing in parallel.

Most MPP architectures implement a shared-nothing architecture, in which each server is self-sufficient and controls its own memory and disk. Because the servers don't compete for a common pool of memory or storage, shared-nothing architectures have a proven record of high scalability and low contention.
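To make the shared-nothing idea concrete, here is a minimal Python sketch. It is not any vendor's implementation: the `Node` class, the `node_for_key` and `distribute` helpers, and the sample `customer_id` data are all hypothetical names made up for illustration. The sketch assumes rows are assigned to servers by hashing a distribution key, which is one common way to spread data so that each server owns its slice outright.

```python
import zlib


class Node:
    """One server in a hypothetical shared-nothing cluster."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.rows = []  # stands in for this node's private disk


def node_for_key(key, node_count):
    # crc32 gives the same answer on every run, unlike Python's
    # built-in hash() for strings, so placement is repeatable.
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, nodes, key_field):
    # Each row lands on exactly one node, chosen by its distribution key.
    for row in rows:
        target = nodes[node_for_key(row[key_field], len(nodes))]
        target.rows.append(row)


nodes = [Node(i) for i in range(4)]
sales = [{"customer_id": i % 50, "amount": i} for i in range(1000)]
distribute(sales, nodes, "customer_id")

for node in nodes:
    print(node.node_id, len(node.rows))
```

Because every row for a given customer lands on the same node, that node can answer questions about the customer without consulting its neighbors, which is where the low contention comes from.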

Most data warehouse appliances distribute data onto dedicated disk storage units connected to each server in the appliance. This distribution allows the appliance to resolve a relational query by scanning the data on each server in parallel and then combining the partial results. This divide-and-conquer approach delivers high performance and, for queries that divide cleanly across servers, scales close to linearly as you add new servers to the architecture.
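The divide-and-conquer pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular appliance works: the function names (`partition`, `local_sum_by_region`, `parallel_sum_by_region`) and the sample rows are invented, rows are spread round-robin for simplicity, and threads merely stand in for separate servers (Python threads don't provide real CPU parallelism for this kind of work).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    # Spread rows round-robin across the simulated servers.
    slices = [[] for _ in range(node_count)]
    for i, row in enumerate(rows):
        slices[i % node_count].append(row)
    return slices


def local_sum_by_region(rows):
    # What one server does: scan only its own slice of the data.
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def parallel_sum_by_region(slices):
    # Scatter: every simulated server scans its slice at the same time.
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(local_sum_by_region, slices))
    # Gather: a coordinator merges the small partial results.
    merged = defaultdict(int)
    for partial in partials:
        for region, total in partial.items():
            merged[region] += total
    return dict(merged)


rows = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
result = parallel_sum_by_region(partition(rows, 3))
print(result)
```

The sketch behaves like a query such as "total sales by region": each server does the heavy scanning on its own data, and only the small per-region subtotals travel to the coordinator. Adding a server shrinks every slice, which is why this style of query tends to scale well.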

From a price perspective, most of the vendors in this arena are pursuing a plug-and-play strategy. For instance, Netezza typically pitches its product to new users as plug-compatible with Teradata, at a price below what Teradata charges for maintenance alone. This price point makes the products very attractive and is driving a growing adoption rate.