November 20, 2011

Rapidly Emerging Technology Series: Database Appliances

The Rapidly Emerging Technology Series highlights current technologies that are relevant to data warehouse professionals.  This posting discusses database appliances.

A database appliance is an integrated, preconfigured package of RDBMS software and hardware.  Most major database vendors, including Microsoft, Oracle, IBM, and Teradata, package and sell database appliances.  Data warehouse appliances are the best-selling database appliances.

Database systems use memory, I/O, processing cores, and storage for database processes, and they need to formulate execution plans that use these resources efficiently.  Hardware configurations for database performance, particularly for data warehousing, are not necessarily the same as configurations for other purposes.  In fact, sometimes database performance isn’t even considered when purchasing and configuring hardware.  In those situations, even the most experienced DBAs and systems administrators aren't always able to tune the system to get satisfactory performance.

A database appliance addresses this problem by delivering hardware and software that are preconfigured to work together.  Most database appliances are designed for a specialized workload such as OLTP or data warehousing.  The servers, storage, OS, and RDBMS software are integrated and optimized for performance.

Some database appliances use parallel processing to distribute workloads across server nodes.  Multi-node systems can be shared-everything, where multiple servers share storage, or shared-nothing, where each server has its own storage.  Shared-everything systems tend to be more expensive, but they allow the same data to be accessed by several servers.  Shared-nothing systems can distribute the rows of a single table across multiple nodes so that each node processes its own portion of a query in parallel.  A shared-nothing system is useful for querying very large fact tables in a data warehouse.
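
To make the shared-nothing idea concrete, here is a minimal sketch in Python.  It is not how any particular vendor's appliance works.  The four-node cluster, the fact_sales rows, the customer_id distribution key, and all function names are hypothetical, chosen only for illustration.  Each "node" is a plain list standing in for a server's local storage, and the per-node step runs sequentially here, where a real appliance would run it on every node at the same time.

```python
from collections import defaultdict
from zlib import crc32

NUM_NODES = 4  # hypothetical cluster size


def node_for(key, num_nodes=NUM_NODES):
    """Pick the node that owns a row using a stable hash of the distribution key."""
    return crc32(str(key).encode("utf-8")) % num_nodes


def distribute(rows, num_nodes=NUM_NODES):
    """Split fact rows into per-node slices; each slice stands in for local storage."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row["customer_id"], num_nodes)].append(row)
    return nodes


def local_sum_by_region(node_rows):
    """Work a single node does on its own rows only (no shared storage needed)."""
    totals = defaultdict(float)
    for row in node_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def merge(partials):
    """Combine the small per-node results into the final answer."""
    merged = defaultdict(float)
    for partial in partials:
        for region, total in partial.items():
            merged[region] += total
    return dict(merged)


# Hypothetical fact table: eight sales rows alternating between two regions.
fact_sales = [
    {"customer_id": i, "region": ("east", "west")[i % 2], "amount": 10.0 * i}
    for i in range(1, 9)
]

nodes = distribute(fact_sales)
partials = [local_sum_by_region(rows) for rows in nodes]  # parallel in a real system
result = merge(partials)
print(result)
```

The point of the design is that only the small partial totals travel between nodes; the large fact table is scanned where it is stored.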

Database appliances generally do not scale well beyond their initial configuration.  For example, you generally don’t add storage to a database appliance after it is installed.  Data warehouse appliances are available to support from about 5 terabytes to hundreds of terabytes of data.

Database appliances can also be very costly.  In many situations, it may be possible to get satisfactory database performance from much less expensive hardware.


Wikipedia article on Data Warehouse Appliances: http://en.wikipedia.org/wiki/Data_warehouse_appliance

1 comment:

  1. Amazon Redshift is a cloud-based data warehouse service that allows you to store and process your data in a scalable and flexible manner. You can store your data in the form of tables, and then use SQL queries to run queries against
