Layer 3 of the Big Data Stack: Organizing Data Services and Tools

Organizing data services and tools, which make up layer 3 of the big data stack, capture, validate, and assemble various big data elements into contextually relevant collections. Because big data is massive, techniques have evolved to process the data efficiently and seamlessly. MapReduce is one heavily used technique: It splits a job into a map step that runs in parallel across pieces of the data and a reduce step that combines the partial results. Many of these organizing data services are MapReduce engines, specifically designed to optimize the organization of big data streams.
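To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch in Python that counts words. It only illustrates the map, shuffle, and reduce steps; it is not how Hadoop or any particular engine is implemented, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on an iterable of text records."""
    # A real engine would run the map calls in parallel on different nodes;
    # here they simply run one after another.
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big stack"]))
```

In a real cluster, each map call could run on the node that holds its slice of the data, and the shuffle would move pairs across the network so that every key ends up at a single reducer.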

Organizing data services are really an ecosystem of tools and technologies that you can use to gather and assemble data in preparation for further processing. As such, the tools need to provide integration, translation, normalization, and scalability. Technologies in this layer include the following:

  • A distributed file system: Necessary to accommodate the decomposition of data streams and to provide scale and storage capacity

  • Serialization services: Necessary for persistent data storage and multilanguage remote procedure calls (RPCs)

  • Coordination services: Necessary for building distributed applications (locking and so on)

  • Extract, transform, and load (ETL) tools: Necessary for the loading and conversion of structured and unstructured data into Hadoop

  • Workflow services: Necessary for scheduling jobs and providing a structure for synchronizing process elements across layers
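The capture, validate, and assemble steps that these services perform can be sketched on a very small scale. The Python example below is only an illustration under stated assumptions: the three-field record schema (id, source, value) and the function names are hypothetical, and JSON from the standard library stands in for a dedicated serialization service.

```python
import json

# Hypothetical schema for this sketch: every record needs these three fields.
REQUIRED_FIELDS = ("id", "source", "value")


def validate(record):
    """Return True only if all required fields are present and value is numeric."""
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return False
    try:
        float(record["value"])
    except (TypeError, ValueError):
        return False
    return True


def normalize(record):
    """Translate a raw record into one consistent shape (types, case, spacing)."""
    return {
        "id": str(record["id"]).strip(),
        "source": str(record["source"]).strip().lower(),
        "value": float(record["value"]),
    }


def serialize(record):
    """Serialize a normalized record to a JSON string for storage or transport."""
    return json.dumps(record, sort_keys=True)


def run_pipeline(raw_records):
    """A tiny workflow: validate, then normalize, then serialize each record."""
    return [serialize(normalize(raw)) for raw in raw_records if validate(raw)]


if __name__ == "__main__":
    raw = [
        {"id": 1, "source": " Web ", "value": "3.5"},
        {"id": 2, "source": "", "value": 1},         # dropped: empty source
        {"id": 3, "source": "log", "value": "n/a"},  # dropped: value not numeric
    ]
    for line in run_pipeline(raw):
        print(line)
```

At big data scale, each of these steps is typically handled by a separate tool from the list above, with a workflow service deciding the order in which they run.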
