NoSQL For Dummies Cheat Sheet
As a NoSQL developer, your first step is to select the right product category and then the right product. These guides compare the most important features of some of the most popular NoSQL databases.
Bigtable/Wide Column Store Features in NoSQL Databases
Bigtable clones are a type of NoSQL database that emerged from Google’s seminal Bigtable paper. They are a highly distributed way to manage tabular data. Their tables are not related to each other as they would be in a traditional relational database management system (RDBMS). Here are the most important features of popular Bigtable clones.
|Feature Area||Accumulo||Cassandra||HBase||Hypertable|
|ACID or BASE||ACID||BASE||BASE||ACID|
|HA Replicas||Yes, Sync||Yes, Async||Yes, Sync||TBD|
|DR Replicas||As HDFS||Yes, Async||As HDFS||TBD|
|Data types||No data type support.||Yes, schema must be defined up front.||No data type support.||No data type support.|
|Data indexing||No secondary indexing.||Not a true “secondary index” feature: it only allows columns to be used in queries and doesn’t speed up queries. Supports Bloom filters.||Supports Bloom filters.||Full secondary indexes.|
|Query and search||Uses Map/Reduce for accessing data.||CQL query language similar to SQL.||Uses Map/Reduce for accessing data. Can be used with Hive query language.||Value exact match and string “starts with” queries. Column exists query term support. No range queries.|
|Commercials||Apache 2. Used in government for secure Bigtable needs.||Commercial version from DataStax.||Apache 2. Available from a number of Hadoop providers.||GPL v3 licensed.|
|Other||Role-based access control (RBAC) and cell-level (per-value) security useful for government use cases. Custom authentication and authorization plug-ins available. Partial encryption at rest of data in Accumulo 1.6. (Intermediate recovery files not encrypted.)||0.5–1.0TB of data recommended per node. SSD storage recommended. 32GB RAM and 4/8 cores recommended. Recommended AWS system for 1TB of data is 2.2xlarge (60GB RAM + SSD storage), or smaller c3.2large for 100GB of data. Support for encryption of data at rest (but not journal logs).||Viewed as the slower of the Hadoop-based NoSQL databases. “Endpoints” provide functionality similar to stored procedures.||Adaptive memory allocation feature automatically tunes RAM usage for write-heavy and read-heavy applications.|
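The data indexing row above mentions Bloom filters. As a rough illustration of the idea only (not how any of these databases implements it), the sketch below shows a filter that answers "definitely absent" or "possibly present" for a row key, which lets a store skip disk files that cannot contain the key. The class name, bit-array size, hash count, and row keys are all hypothetical.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """A tiny Bloom filter: a bit array plus k salted hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str) -> Iterator[int]:
        # Derive k bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical row keys stored in one on-disk file.
row_keys_on_disk = BloomFilter()
for row_key in ("user:1001", "user:1002", "user:1003"):
    row_keys_on_disk.add(row_key)

if row_keys_on_disk.might_contain("user:1002"):
    print("Key may be in this file; read it to confirm.")
```

A Bloom filter never gives a false negative, so a `False` answer safely avoids a disk read. A `True` answer still has to be confirmed against the real data.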
Key-Value Store NoSQL Database Features
Key-value stores are no-frills NoSQL databases that generally delegate all value handling to the application code itself. These are the key features of common key-value store databases.
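To show what delegating value handling to application code can look like, here is a minimal sketch. It assumes a hypothetical in-memory `KeyValueStore` that only sees keys and opaque bytes. The JSON serialization, the `profile:` key prefix, and the helper names are illustrative choices, not the API of any particular product.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Hypothetical store: it only ever sees keys and opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def save_profile(store: KeyValueStore, user_id: str, profile: dict) -> None:
    # The application, not the store, decides how a value is encoded.
    store.put(f"profile:{user_id}", json.dumps(profile).encode("utf-8"))


def load_profile(store: KeyValueStore, user_id: str) -> Optional[dict]:
    # The application also decodes the bytes it gets back.
    raw = store.get(f"profile:{user_id}")
    return None if raw is None else json.loads(raw.decode("utf-8"))


store = KeyValueStore()
save_profile(store, "42", {"name": "Ada", "langs": ["en", "fr"]})
print(load_profile(store, "42"))
```

Because the store cannot look inside a value, anything like filtering on `name` would have to be done by the application after fetching the value by key.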
Document NoSQL Database Features
Document NoSQL databases are flexible and schema agnostic, which means you can load any type of document without the database needing to know the document’s structure up front. Document NoSQL databases support these important features.
|Feature Area||Couchbase||Microsoft DocumentDB||MarkLogic Server||MongoDB|
|ACID or BASE||BASE||BASE, client driver consistency selection||ACID, fully serializable||BASE, client driver consistency selection|
|HA Replicas||No||Managed by Azure platform.||Yes, Sync||Yes, Async (default)|
|DR Replicas||Yes, master-master, Async||Managed by Azure platform.||Yes, Async||Yes, Async|
|Data types||JSON document model||JSON document model. Same types supported as JSON: strings, numbers (IEEE754), and Booleans. Extended date-time, guid, Int64 types supported.||XML, JSON, text, and binary documents supported. All W3C XML schema data types supported.||JSON document model. Same types as JSON. Support for 2D […]|
|Data indexing||Secondary indexes supported. Views supported. No universal index. Indexes updated asynchronously.||Universal index for all JSON documents. Universal index includes automatic range index detection. Indexes eventually consistent, by default.||Universal index for all text, XML, and JSON documents. Views not supported. Supports range indexes. Indexes updated within the ACID transaction. Geospatial 2D indexes.||No universal index. Secondary indexes configurable on named […]|
|Query and search||Memcached API fully supported. Queries over documents and views supported.||Uses SQL over HTTP for queries. No free text search grammar support. Projection and range queries supported.||Free text (similar to Google search box) search grammar and structured queries both supported. Range queries supported. Aggregates can be calculated during a search. Geospatial queries supported.||Custom JSON query format with support for range queries. No free text search grammar support. Text and geospatial (GeoJSON) queries supported.|
|Commercials|| ||Commercial-only model. Provided only on Microsoft’s Azure platform.||Commercial-only model.||AGPL licensed. Commercial licenses available.|
|Other|| ||Microsoft’s Azure platform hides many of the complexities of scaling out a large database across multiple geographies.||Provides meetups at some MarkLogic offices worldwide. Document-level security model implemented.||Strong support for local meetups at many MongoDB offices worldwide. 10 official and 32 community client drivers.|
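The schema-agnostic loading described at the start of this section, along with the exact-match and range queries listed in the table, can be sketched as follows. This is a toy in-memory model under stated assumptions: `DocumentStore`, its method names, and the sample documents are all hypothetical. It scans every document instead of using the secondary or universal indexes that real products rely on.

```python
from typing import Any, Dict, List


class DocumentStore:
    """Hypothetical in-memory store that accepts any JSON-like dict."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def insert(self, doc_id: str, doc: Dict[str, Any]) -> None:
        # No schema check: the store never needs to know the fields up front.
        self._docs[doc_id] = doc

    def find_equal(self, field: str, value: Any) -> List[str]:
        """Exact-match query on a top-level field; documents lacking it are skipped."""
        return sorted(
            doc_id for doc_id, doc in self._docs.items()
            if field in doc and doc[field] == value
        )

    def find_range(self, field: str, low: float, high: float) -> List[str]:
        """Inclusive range query on a numeric top-level field."""
        return sorted(
            doc_id for doc_id, doc in self._docs.items()
            if isinstance(doc.get(field), (int, float)) and low <= doc[field] <= high
        )


store = DocumentStore()
# Three documents with three different shapes, loaded without declaring any of them.
store.insert("book-1", {"type": "book", "title": "Example Book", "pages": 320})
store.insert("book-2", {"type": "book", "title": "Another Book",
                        "authors": ["A. Writer"], "pages": 120})
store.insert("invoice-1", {"type": "invoice", "total": 99.5,
                           "lines": [{"sku": "A1", "qty": 2}]})

print(store.find_equal("type", "book"))
print(store.find_range("pages", 100, 200))
```

A real document database answers the same kinds of questions from indexes, which is why the indexing row in the table matters when you compare products.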
Triple Store and Graph NoSQL Database Features
You can use a triple store or graph NoSQL database if you have a web of interconnected data, or you can simply tag your data and infer relationships according to the records that share the same tags. These database products support these important features.
|Feature Area||AllegroGraph||MarkLogic Server||Neo4j||OrientDB|
|ACID or BASE||ACID, fully serializable||ACID, fully serializable||ACID, read committed||ACID, fully serializable or read committed|
|HA Replicas||No||Yes, Sync||No||Yes, Sync|
|DR Replicas||Yes, Async||Yes, Async||Yes, Sync (when available)||TBD|
|Data types||Supports integers, unsigned integers, floating point, decimals, and dates and times.||JSON, binary, XML, free text storage supported. All W3C RDF and XML schema types supported.||Java data types supported.||JSON, binary, and RDF storage supported.|
|Data indexing||Triple indexes optimized for graph style queries. 7 SPOGI indexes.||Triple index optimized for known depth triple store style queries. 4 SPOGI indexes.||Triple indexes optimized for graph style queries (shortest path, subgraph, and so on). 7 SPOGI indexes.||Has own triple index. Optimized for triple store style queries.|
|Query and search||SPARQL 1.0 and 1.1 supported. SPARQL Inferencing Notation (SPIN) API supported.||SPARQL 1.0 compliance, SPARQL 1.1 partial compliance (will be nearly compliant in upcoming version 8). Inferencing support in […]||Cypher query language provided, resembling SQL. No standards support. Shortest path, Dijkstra, and A* graph algorithms supported.||No W3C SPARQL or Graph Store Protocol support for storing or querying RDF data. Has own query language.|
|Commercials||Commercial-only model. Available from Franz, Inc. Free version available limited to 5 million triples. Developer version available limited to 50 million triples.||Commercial-only model. Entry level “Essential Enterprise” edition for small clusters, and “Global Enterprise” for large clusters.||Provided under AGPL. Commercial license available. Discounted start-up license available.||Favorable commercial terms available for startups. Commercial support available for Apache 2 licensed edition, although feature limited. All features are available only in the commercial edition.|
|Other||Triple-level security supported. Online backups with point-in-time recovery supported. CLIF++ and RDFS++ supported. Includes a Social Network Analysis (SNA) library.||Record-level (Graph) security support. Provides meetups at some MarkLogic offices worldwide.||Neo Technology recommends SSDs for good performance.||Record-level (Graph) security support.|
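The idea from the start of this section, tagging data and inferring relationships between records that share a tag, can be sketched with plain subject-predicate-object triples. This is a minimal illustration under stated assumptions: `TripleStore`, the `hasTag` predicate, and the article names are hypothetical, and real products answer such patterns from triple indexes and query languages like SPARQL or Cypher instead of a linear scan.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Hypothetical in-memory triple store with wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        # None acts as a wildcard for that position of the triple.
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def related_by_tag(store: TripleStore, subject: str) -> Set[str]:
    """Infer related subjects: anything sharing at least one tag with `subject`."""
    tags = {o for _, _, o in store.match(s=subject, p="hasTag")}
    related: Set[str] = set()
    for tag in tags:
        for other, _, _ in store.match(p="hasTag", o=tag):
            if other != subject:
                related.add(other)
    return related


store = TripleStore()
store.add("article1", "hasTag", "nosql")
store.add("article2", "hasTag", "nosql")
store.add("article2", "hasTag", "graph")
store.add("article3", "hasTag", "graph")
store.add("article4", "hasTag", "cooking")

print(related_by_tag(store, "article2"))
```

No explicit "related to" triple is ever stored. The relationship is derived at query time from shared tags, which is the kind of inference this category of database is built to make efficient.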