Extending Your Data Layer with NoSQL

By Adam Fowler

A database does one thing very well: It stores data. However, because all applications need additional software to be complete, it’s worth ensuring that your selected NoSQL database has the tools and partner software that provide the extended functionality you require.

If you don’t confirm that this extended functionality is supported, you may end up installing several NoSQL databases at your organization, which means additional cost in terms of support, training, and infrastructure. It’s better to select a NoSQL database that can meet the scope of your goals, either through its own features or through a limited number of partner software products.

The ability to extend NoSQL databases varies greatly. You might think that open‐source software is easy to extend; however, just because its API is public doesn’t mean it’s documented well enough to extend.

Whether you select open‐source or commercial software, be sure the developer documentation and training are first rate. You may find, for example, that commercial software vendors have clearer and more detailed published API documentation, and well‐documented partner applications from which you can buy compatible software and support.

These software extensions can be anything useful to your business, but typically they sit on either the ingest side or the information analysis side of data management rather than being purely about storage. For example, extract, transform, and load (ETL) tools from the relational database world are being slowly (slowly) updated for NoSQL databases. Also, partner end‐user applications are emerging with native connectors. The Tableau Business Intelligence (BI) tool, for example, includes native connectors for NoSQL databases.

Ingestion connectors that take information from sources such as Twitter, SharePoint, and virtual file systems, and then combine this data, may be useful. Your own organization’s data can be combined with reference data from open data systems (for example, the data.gov, data.gov.uk, geonames, and dbpedia websites). These systems typically use XML, JSON, or RDF as open data formats, facilitating easier data sharing.
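To make the idea of combining your own data with open reference data more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the customer records, the gazetteer-style place list, the field names, and the enrich function are hypothetical and don’t come from any particular NoSQL product or open data site. It simply shows how two JSON sources might be joined on a shared key before the result is stored as documents.

```python
import json

# Hypothetical in-house records, as they might sit in a document database.
customers_json = """[
  {"id": 1, "name": "Acme Ltd", "city": "London"},
  {"id": 2, "name": "Widget Co", "city": "Paris"},
  {"id": 3, "name": "Gizmo Inc", "city": "Atlantis"}
]"""

# Hypothetical reference data, in the style of an open gazetteer export.
places_json = """[
  {"name": "London", "country": "GB", "lat": 51.5074, "lon": -0.1278},
  {"name": "Paris", "country": "FR", "lat": 48.8566, "lon": 2.3522}
]"""


def enrich(customers, places):
    """Return copies of the customer documents, adding a 'location'
    sub-document wherever the city matches a reference place name."""
    by_name = {place["name"]: place for place in places}
    enriched = []
    for customer in customers:
        doc = dict(customer)  # copy so the input is left untouched
        place = by_name.get(customer["city"])
        if place is not None:
            doc["location"] = {
                "country": place["country"],
                "lat": place["lat"],
                "lon": place["lon"],
            }
        enriched.append(doc)
    return enriched


enriched = enrich(json.loads(customers_json), json.loads(places_json))
print(json.dumps(enriched, indent=2))
```

Records with no match in the reference data (the third customer here) pass through unchanged. A real connector would also have to deal with ambiguous names, differing spellings, and much larger data volumes.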

Integration with legacy apps is always a problem. How do you display your geospatially enriched documents within a GIS tool? It’s tricky. Open standards are key to this integration and are already widely supported. Examples are GeoJSON, OGC WFS, and WMS mapping query connectors.
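As a rough illustration of why open standards help here, the following Python sketch wraps documents that carry latitude and longitude fields in a GeoJSON FeatureCollection, a format many GIS tools can load. The document shape, the lat and lon field names, and both function names are assumptions made for this example. The one detail taken from the GeoJSON standard is that a Point lists longitude before latitude.

```python
import json


def to_feature(doc, lat_key="lat", lon_key="lon"):
    """Wrap one document as a GeoJSON Feature with Point geometry.

    The lat_key and lon_key defaults are assumed field names; all other
    fields are carried over as the feature's properties.
    """
    properties = {k: v for k, v in doc.items() if k not in (lat_key, lon_key)}
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON positions are ordered [longitude, latitude].
            "coordinates": [doc[lon_key], doc[lat_key]],
        },
        "properties": properties,
    }


def to_feature_collection(docs):
    """Bundle several documents into a single GeoJSON FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(doc) for doc in docs],
    }


# Hypothetical geospatially enriched documents.
docs = [
    {"title": "Site report A", "lat": 51.5074, "lon": -0.1278},
    {"title": "Site report B", "lat": 48.8566, "lon": 2.3522},
]

collection = to_feature_collection(docs)
print(json.dumps(collection, indent=2))
```

Some NoSQL databases can produce this kind of output natively or through a connector, so check your vendor’s documentation before writing a conversion layer of your own.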

File‐based applications are always a bit of a problem. It’s a logical next step to present a document database as a file system. Many NoSQL databases support the old and clunky WebDAV protocol. Alas, as yet, no file system driver has become prevalent. Some NoSQL databases are bound to go this way, though.

Ask your NoSQL vendors about their supported partner applications and extensions. These may cost less than building an extended solution yourself, or paying for vendors’ professional services.