By Adam Fowler

NoSQL databases are well suited to very large datasets, and Bigtable clones like HBase are no exception. You’ll likely want to use several inexpensive commodity servers in a single cluster rather than one very powerful machine, because many commodity servers give you better overall performance per dollar than a single, far more costly server.

In addition to letting you scale out quickly, inexpensive commodity servers also make your database service more resilient to hardware failures. If a single server’s motherboard fails, other servers can take over the service, which isn’t possible with one large server.

The figure shows a highly available HBase configuration with an example of data split among servers.


The diagram shows two nodes (HRegionServers) in a highly available setup, each acting as a backup for the other.

In many production setups, you want at least three nodes for high availability, so that two server failures close together in time can still be handled. This isn’t as rare as you’d think! Advice varies between Bigtable clones; HBase, for example, recommends five nodes as the minimum for a cluster:

  • Each region server manages its own set of keys.

    Designing a row key‐allocation strategy is important because it dictates how the load is spread across the cluster; a short sketch of one common approach appears after this list.

  • Each region maintains its own write log and in‐memory store.

    In HBase, all data is written to an in‐memory store, and later this store is flushed to disk. On disk, these stores are called store files.

    HBase interprets store files as single files, but in reality they’re distributed in chunks across the Hadoop Distributed File System (HDFS). This provides high ingest and retrieval speeds because all large I/O operations are spread across many machines.
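Here’s a minimal sketch of the row key point above, using the standard HBase Java client. It “salts” the natural key with a small hash prefix so that consecutive writes spread across regions instead of piling onto one server. The table name orders, the column family d, and the bucket count of 16 are made-up values for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedWriter {

    // Prefix the natural key with a small hash-based "salt" so that rows
    // written in key order land on different regions rather than one hot region.
    static byte[] saltedRowKey(String naturalKey, int buckets) {
        int bucket = Math.abs(naturalKey.hashCode() % buckets);
        return Bytes.toBytes(String.format("%02d-%s", bucket, naturalKey));
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("orders"))) {

            Put put = new Put(saltedRowKey("2023-06-01-order-12345", 16));
            put.addColumn(Bytes.toBytes("d"),      // column family
                          Bytes.toBytes("total"),  // column qualifier
                          Bytes.toBytes("99.95")); // value
            table.put(put); // written to the region's write log and in-memory store
        }
    }
}

Because HBase keeps rows sorted by key, a purely time-based key would send every new write to the same region; the salt prefix spreads the load across up to 16 regions, at the cost of slightly more work when you later scan back in key order.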

To maximize data availability, Hadoop maintains three copies of each data file by default (a setting you can inspect or change, as the sketch after this list shows). Large installations have

  • A primary copy

  • A replica within the same rack

  • Another replica in a different rack
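As a rough sketch of how those copies look from code, the snippet below uses the Hadoop FileSystem Java API to list the replication count for files under an assumed HBase data directory and then raises the count for one hypothetical file. The paths are invented for illustration, and the default of three copies comes from the dfs.replication setting in hdfs-site.xml.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical HBase table directory; adjust to your cluster's layout.
        Path tableDir = new Path("/hbase/data/default/orders");
        for (FileStatus status : fs.listStatus(tableDir)) {
            // getReplication() reports how many copies HDFS keeps of this file.
            System.out.printf("%s -> %d replicas%n",
                    status.getPath(), status.getReplication());
        }

        // Keep extra copies of an especially important (hypothetical) file.
        fs.setReplication(new Path("/important/data.file"), (short) 5);
    }
}

Where each replica lands (local node, same rack, different rack) is decided by HDFS itself, not by this code.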

Prior to Hadoop 2.0, NameNodes could not be made highly available. The NameNode keeps track of the active servers in the cluster and of which servers hold each block of data, so it was a single point of failure. Since Hadoop 2.0, this limitation no longer exists.