Layer 0 of the Big Data Stack: Redundant Physical Infrastructure
At the lowest level of the big data stack is the physical infrastructure. Your company might already have a data center or have invested in physical infrastructure, so you will want to find a way to use those existing assets.
Big data implementations have very specific requirements on all elements in the reference architecture, so you need to examine these requirements on a layer-by-layer basis to ensure that your implementation will perform and scale according to the demands of your business.
A prioritized list of big data principles should include statements about the following:
Performance: How responsive do you need the system to be? Performance, often expressed as latency, is usually measured end to end, based on a single transaction or query request.
Availability: Do you need a 100 percent uptime guarantee of service? How long can your business wait in the case of a service interruption or failure?
Scalability: How big does your infrastructure need to be? How much disk space is needed today and in the future? How much computing power do you need? Typically, you determine what you need and then add a little extra capacity for unexpected challenges.
Flexibility: How quickly can you add more resources to the infrastructure? How quickly can your infrastructure recover from failures?
Cost: What can you afford? Because the infrastructure is a set of components, you might be able to buy the “best” networking and decide to save money on storage. You need to establish requirements for each of these areas in the context of an overall budget and then make trade-offs where necessary.
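The availability and scalability questions above come down to simple arithmetic. The following is a minimal sketch, not a sizing method: the function names, the 20 percent default headroom, and the sample figures are illustrative assumptions. It converts an uptime target into a yearly downtime budget and projects storage needs with a safety margin.

```python
# Illustrative helpers; function names and sample figures are assumptions.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year the service may be unavailable at a given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


def projected_storage_tb(current_tb: float, annual_growth: float,
                         years: int, headroom: float = 0.2) -> float:
    """Compound today's storage forward, then add a safety margin."""
    grown = current_tb * (1 + annual_growth) ** years
    return grown * (1 + headroom)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows {hours:.2f} hours of downtime per year")
    # Assumed scenario: 100 TB today, 50% yearly growth, two-year horizon.
    print(f"Storage to plan for: {projected_storage_tb(100, 0.5, 2):.0f} TB")
```

A 99.9 percent target, for instance, leaves roughly 8.76 hours of downtime per year, which makes the gap between that target and a 100 percent guarantee concrete when you weigh it against cost.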
Because big data is all about high velocity, high volume, and high variety of data, the physical infrastructure can make or break the implementation. Most big data implementations need to be highly available, so the networks, servers, and physical storage must be both resilient and redundant. Resiliency and redundancy are interrelated.
An infrastructure, or a system, is resilient to failure or changes when sufficient redundant resources are in place, ready to jump into action. Redundancy ensures that the malfunction of a single component won't cause an outage. Resiliency helps to eliminate single points of failure in your infrastructure.
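To see why redundancy matters, it helps to put numbers on it. The sketch below uses the standard series and parallel availability formulas under one loud assumption: components fail independently of one another, which real data centers only approximate. The function names and the 99 percent figures are hypothetical.

```python
from math import prod

# Assumption: component failures are independent. Shared power, cooling,
# or software faults would make real-world results worse than these figures.


def series_availability(availabilities: list[float]) -> float:
    """Availability of a chain in which every component must be working."""
    return prod(availabilities)


def redundant_availability(availability: float, copies: int) -> float:
    """Availability of identical redundant copies where one working copy is enough."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - availability) ** copies


if __name__ == "__main__":
    # Network, server, and storage each assumed to be 99% available.
    single_path = series_availability([0.99, 0.99, 0.99])
    print(f"No redundancy: {single_path:.4%}")

    # The same chain with every layer duplicated.
    duplicated = series_availability([redundant_availability(0.99, 2)] * 3)
    print(f"Each layer duplicated: {duplicated:.4%}")
```

Under these assumptions, a chain of three single components is less available than any one of them, while duplicating each layer removes the single points of failure and raises the combined figure well above that of an individual component.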
In large data centers with business continuity requirements, most of the redundancy is in place and can be leveraged to create a big data environment. In new implementations, the designers have the responsibility to map the deployment to the needs of the business based on costs and performance.
When the infrastructure is delivered by a service provider, the technical and operational complexity is masked behind a collection of services, each with specific terms for performance, availability, recovery, and so on. These terms are described in service-level agreements (SLAs) and are usually negotiated between the service provider and the customer, with penalties for noncompliance.
In effect, this creates a virtual data center. Even with this approach, you should still know what is needed to build and run a big data deployment so that you can make the most appropriate selections from the available service offerings. Despite having an SLA, your organization still has the ultimate responsibility for performance.
Physical redundant big data networks
Networks should be redundant and must have enough capacity to accommodate the anticipated volume and velocity of the inbound and outbound data in addition to the “normal” network traffic experienced by the business. As you begin making big data an integral part of your computing strategy, it is reasonable to expect volume and velocity to increase.
Infrastructure designers should plan for these expected increases and try to create physical implementations that are “elastic.” As network traffic ebbs and flows, so too does the set of physical assets associated with the implementation. Your infrastructure should offer monitoring capabilities so that operators can react when more resources are required to address changes in workloads.
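As a rough illustration of the monitoring idea, the sketch below flags when recent network utilization approaches capacity so that operators can add resources. The function name, the 80 percent threshold, and the sample readings are assumptions made for this example, not values from any particular monitoring product.

```python
from statistics import mean


def needs_more_capacity(samples_mbps: list[float], capacity_mbps: float,
                        threshold: float = 0.8) -> bool:
    """Return True when average recent traffic reaches the utilization threshold.

    samples_mbps: recent throughput readings (hypothetical monitoring data).
    capacity_mbps: provisioned link capacity.
    threshold: fraction of capacity at which operators should react (assumed 0.8).
    """
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    if not samples_mbps:
        return False  # no data yet, so there is nothing to react to
    return mean(samples_mbps) / capacity_mbps >= threshold


if __name__ == "__main__":
    recent = [700.0, 900.0, 950.0]  # Mbps readings on an assumed 1,000 Mbps link
    if needs_more_capacity(recent, capacity_mbps=1000.0):
        print("Utilization is high: consider adding network capacity")
    else:
        print("Utilization is within the planned range")
```

Averaging over a window rather than reacting to a single reading is a deliberate choice here, because traffic that ebbs and flows would otherwise trigger constant false alarms.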
Manage big data hardware: Storage and servers
Likewise, the hardware (storage and server) assets must have sufficient speed and capacity to handle all expected big data workloads. It's of little use to have a high-speed network with slow servers because the servers will most likely become a bottleneck. However, a very fast set of storage and compute servers can compensate for variable network performance. Of course, nothing will work properly if network performance is poor or unreliable.
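The bottleneck argument can be sketched with a deliberately simplified model in which data flows through each layer in turn, so end-to-end throughput is capped by the slowest layer. The stage names and throughput figures below are hypothetical, and the model ignores caching, buffering, and parallelism.

```python
def find_bottleneck(stage_throughput_mbps: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its throughput, which caps the whole pipeline."""
    if not stage_throughput_mbps:
        raise ValueError("at least one stage is required")
    name = min(stage_throughput_mbps, key=stage_throughput_mbps.get)
    return name, stage_throughput_mbps[name]


if __name__ == "__main__":
    # Assumed figures: a fast network feeding slower servers and storage.
    stages = {"network": 10_000.0, "servers": 2_000.0, "storage": 4_000.0}
    stage, limit = find_bottleneck(stages)
    print(f"Bottleneck: {stage}, limiting throughput to about {limit:.0f} Mbps")
```

In this assumed setup, upgrading the network further would change nothing, because the servers set the limit. That is the reason to size all three layers together.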
Big data infrastructure operations
Another important design consideration is infrastructure operations management. The greatest levels of performance and flexibility will be present only in a well-managed environment. Data center managers need to be able to anticipate and prevent catastrophic failures so that the integrity of the data, and by extension the business processes, is maintained. IT organizations often overlook and therefore underinvest in this area.