Triple and Graph Stores in NoSQL

By Adam Fowler

Although it’s just now becoming prominent, the concept of triples has been around since 1998, thanks to the World Wide Web Consortium (W3C) and Sir Tim Berners‐Lee. If you’re experienced with LinkedIn or Facebook, you’re probably familiar with the term social graph. Under the hood of these approaches is a simple concept: every fact (or more correctly, assertion) is described as a triple of subject, predicate, and object:

  • A subject is the thing you’re describing. It has a unique ID called an IRI. It may also have a type, which could be a physical object (like a person) or a concept (like a meeting).

  • A predicate is the property or relationship belonging to the subject. This again is a unique IRI that is used for all subjects with this property.

  • An object is either the intrinsic value of a property (such as an integer, a Boolean, or text) or the IRI of another subject that is the target of a relationship.

The figure illustrates a single subject, predicate, object triple.

image0.jpg

Therefore, Adam likes Cheese is a triple. You can model this data more descriptively, as shown here:

AdamFowler is_a Person
AdamFowler likes Cheese
Cheese is_a Foodstuff
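
As a rough illustration of how little structure this needs, the three assertions above can be held as plain (subject, predicate, object) tuples and queried by pattern. This is only a sketch: the `match` helper and its `None`-as-wildcard convention are assumptions made for the example, not the API of any particular triple store.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# The three assertions from the text, stored as (subject, predicate, object).
triples: Set[Triple] = {
    ("AdamFowler", "is_a", "Person"),
    ("AdamFowler", "likes", "Cheese"),
    ("Cheese", "is_a", "Foodstuff"),
}


def match(store: Set[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every part given; None matches anything."""
    found = []
    for s, p, o in store:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        found.append((s, p, o))
    # Sort so the result does not depend on set iteration order.
    return sorted(found)


# Everything asserted about AdamFowler, then everything that is_a something.
about_adam = match(triples, subject="AdamFowler")
typed_things = match(triples, predicate="is_a")
```

A real triple store would index the three positions rather than scan every triple, but the query shape (fix some parts, leave the others open) is the same idea.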

More accurately, though, such triple information is conveyed with full IRI information in a format such as Turtle, like this:

<http://www.mydomain.org/people#AdamFowler> a <http://www.mydomain.org/rdftypes#Person> .
<http://www.mydomain.org/people#AdamFowler> <http://www.mydomain.org/predicates#likes> <http://www.mydomain.org/foodstuffs#Cheese> .
<http://www.mydomain.org/foodstuffs#Cheese> a <http://www.mydomain.org/rdftypes#Foodstuff> . 
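
To make the link between the short names and the full IRIs concrete, here is a small sketch that expands prefixed names and prints statements in the style shown above. The namespace IRIs are the ones from the Turtle example; the prefix labels (`people`, `types`, `pred`, `food`) and the `expand` and `to_turtle` helpers are hypothetical names chosen for this illustration.

```python
# Hypothetical prefix labels mapped to the namespaces used in the Turtle example.
PREFIXES = {
    "people": "http://www.mydomain.org/people#",
    "types": "http://www.mydomain.org/rdftypes#",
    "pred": "http://www.mydomain.org/predicates#",
    "food": "http://www.mydomain.org/foodstuffs#",
}


def expand(name: str) -> str:
    """Turn 'prefix:Local' into a full IRI wrapped in angle brackets."""
    prefix, local = name.split(":", 1)
    return "<" + PREFIXES[prefix] + local + ">"


def to_turtle(subject: str, predicate: str, obj: str) -> str:
    """Render one triple as a Turtle statement with full IRIs."""
    # In Turtle, the keyword `a` is shorthand for the rdf:type predicate.
    pred = "a" if predicate == "a" else expand(predicate)
    return expand(subject) + " " + pred + " " + expand(obj) + " ."


statements = [
    to_turtle("people:AdamFowler", "a", "types:Person"),
    to_turtle("people:AdamFowler", "pred:likes", "food:Cheese"),
    to_turtle("food:Cheese", "a", "types:Foodstuff"),
]
print("\n".join(statements))
```

Turtle itself supports this kind of abbreviation through prefix declarations, so in practice you would normally let an RDF library do the expansion instead of writing it by hand.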

The full Turtle example shows a set of patterns in a single information domain for the IRIs of RDF types, people, relationships, and foodstuffs. A single information domain is referred to as an ontology. Multiple ontologies can coexist in the same triple store.

It’s even possible for the same subject to have multiple IRIs, with a sameAs triple asserting that both subjects are equivalent.
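
One way an application might act on sameAs assertions is to pick a single representative IRI for each group of equivalent subjects. The sketch below does that with a small union-find structure. The second IRI (`http://www.example.org/staff#afowler`) is invented for the example, and choosing the alphabetically smallest IRI as the representative is an arbitrary assumption, not something triple stores are required to do.

```python
from typing import Dict, Iterable, Tuple


def canonical_ids(same_as_pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map every IRI seen in a sameAs pair to one representative per group."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Arbitrary but deterministic: the smaller string becomes the root.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {iri: find(iri) for iri in list(parent)}


# Hypothetical second identifier for the same person.
pairs = [
    ("http://www.mydomain.org/people#AdamFowler",
     "http://www.example.org/staff#afowler"),
]
canon = canonical_ids(pairs)
```

With this mapping, triples that mention either IRI can be rewritten to the representative before querying, so facts recorded under both identifiers are found together.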

You can quickly build this simple data structure into a web of facts, which is called a directed graph in computer science. You could be a friend_of Jon Williams or married_to Wendy Fowler. Wendy Fowler may or may not have a knows relationship with Jon Williams.
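
The sketch below shows how a list of triples becomes a directed graph: each subject is a node, and each triple is a labeled edge pointing from subject to object. It assumes `AdamFowler` stands in for "you" in the sentence above, and `build_graph` is a hypothetical helper written for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed stand-in data: AdamFowler plays the part of "you" in the text.
facts = [
    ("AdamFowler", "friend_of", "JonWilliams"),
    ("AdamFowler", "married_to", "WendyFowler"),
]


def build_graph(triples: List[Tuple[str, str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triples by subject: node -> list of (predicate, target) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(facts)

# Edges are directed: nothing has been asserted with JonWilliams as the subject,
# so he has no outgoing edges, and no knows edge to WendyFowler exists.
jon_edges = graph.get("JonWilliams", [])
```

Because the edges only run one way, a relationship in the other direction (or between Wendy and Jon) exists only if someone asserts it as a separate triple.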

These directed graphs can contain complex and changing webs of relationships, or triples. Being able to store and query them efficiently, either on their own or as part of a larger multi‐data structure application, is very useful for solving particular data storage and analytics problems.

The figure shows an example of a complex web of interrelated facts.

image1.jpg

Think of graph stores as a subset of triple stores that are optimized for queries of relationships, rather than just the individual assertions, or facts, themselves.

Graph math is complex and specialized, and it may not be required in every situation where storing triples is required.

If you need to store facts, dynamically changing relationships, or provenance information, then consider a triple store. If you need to know statistics about the graph (such as how many degrees of separation lie between two subjects or how many third-level social connections a person has), then you should consider a graph store.
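
To give a feel for the two graph statistics just mentioned, here is a minimal sketch using breadth-first search over an in-memory adjacency list. The people and their knows links are made up for the example, the links are assumed to be mutual, and a real graph store would run this kind of traversal with its own optimized engine and query language.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical social graph: each person maps to the people they know.
knows: Dict[str, List[str]] = {
    "Adam": ["Jon", "Wendy"],
    "Jon": ["Adam", "Priya"],
    "Wendy": ["Adam"],
    "Priya": ["Jon", "Sam"],
    "Sam": ["Priya"],
}


def distances_from(graph: Dict[str, List[str]], start: str) -> Dict[str, int]:
    """Breadth-first search: fewest hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degrees_of_separation(graph: Dict[str, List[str]], a: str, b: str) -> Optional[int]:
    """Number of hops on the shortest path from a to b, or None if unreachable."""
    return distances_from(graph, a).get(b)


def connections_at_level(graph: Dict[str, List[str]], person: str, level: int) -> List[str]:
    """Everyone whose shortest path from person is exactly `level` hops."""
    dist = distances_from(graph, person)
    return sorted(name for name, hops in dist.items() if hops == level)


separation = degrees_of_separation(knows, "Adam", "Sam")
third_level = connections_at_level(knows, "Adam", 3)
```

Both questions depend on walking chains of relationships rather than looking up individual facts, which is the kind of workload the text says graph stores are optimized for.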