Understanding Storage Fundamentals in the Hybrid Cloud - dummies

By Judith Hurwitz, Marcia Kaufman, Fern Halper, Daniel Kirsch

The design of cloud storage in a hybrid cloud environment is similar to other cloud architectures in terms of self-service, elasticity, and scalability. Cloud storage is a technique for abstracting storage behind a well-defined interface so that it can be used as a self-service application. In addition, cloud storage needs to support a multi-tenant architecture so that each consumer’s cloud data is managed in isolation from other consumers’ cloud data.
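As a minimal sketch of the isolation requirement, the Python below keeps each tenant's objects in a separate namespace behind a small put/get interface. The class and method names (`MultiTenantStore`, `put`, `get`) and the tenant IDs are hypothetical and are not taken from any real cloud product.

```python
class MultiTenantStore:
    """Toy in-memory store that isolates objects per tenant (illustrative only)."""

    def __init__(self):
        # One private dictionary per tenant ID.
        self._data = {}

    def put(self, tenant_id, key, value):
        self._data.setdefault(tenant_id, {})[key] = value

    def get(self, tenant_id, key):
        # A lookup only ever searches the caller's own namespace.
        try:
            return self._data[tenant_id][key]
        except KeyError:
            raise KeyError(f"{key!r} not found for tenant {tenant_id!r}") from None


store = MultiTenantStore()
store.put("tenant-a", "report.txt", b"A's data")
store.put("tenant-b", "report.txt", b"B's data")
```

Both tenants use the same key, yet each one reads back only its own object. A real service would also authenticate the caller before trusting the tenant ID.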

One of the most important characteristics of cloud storage is how it can dynamically interface with other cloud services, such as SaaS (Software as a Service), PaaS (Platform as a Service), IaaS (Infrastructure as a Service), and BPaaS (Business Process as a Service).

Attaching storage to systems is not a new idea; it has been done since the first systems rolled off the assembly line. Today, most storage environments are connected with systems through a standard interface called SCSI (Small Computer System Interface). SCSI is a very mature protocol that is widely adopted because of its reliability and performance.

Cloud storage access protocols

One important issue in cloud storage is the speed and ease of accessing the data when it’s needed. In order for cloud storage to be a viable alternative to on-premises data storage, you need to be able to access your data at a competitive cost and at a time that is appropriate for the situation.

Today, there are four types of cloud storage access methods:

  • Web services application programming interfaces (APIs): These use RESTful APIs (according to the principles of Representational State Transfer) to integrate with applications.

  • File-based protocols: These protocols are used to transfer files and provide integration independent of the application being connected. They also integrate faster than web service APIs. Common examples are:

    • Network File System (NFS)

    • Common Internet File System (CIFS)

    • File Transfer Protocol (FTP)

  • Block-based APIs: These use Internet SCSI (iSCSI) to connect a front end to storage middleware that supports services like data replication and data reduction.

  • Web Distributed Authoring and Versioning (WebDAV): This is an extension of Hypertext Transfer Protocol (HTTP).

The most common methods for accessing cloud storage are web service APIs. Cloud storage vendors implement this technology because it’s dynamic and simple to use in the cloud. In addition, because of virtualization in cloud environments, there’s a requirement for a more stateless access protocol: each request carries everything the server needs to process it, so it doesn’t matter which server handles the request. Web service APIs support this requirement for statelessness.
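To make the idea of a stateless web service API concrete, here is a minimal sketch that builds (but does not send) HTTP requests for a single stored object. The endpoint, token, bucket, and key are hypothetical placeholders, and the URL layout is an assumption rather than any particular vendor's API.

```python
from urllib.request import Request

# Hypothetical endpoint and token, used only for illustration.
BASE_URL = "https://storage.example.com/v1"
TOKEN = "example-token"


def build_request(method, bucket, key, body=None):
    """Build (but do not send) a self-contained HTTP request for one object."""
    url = f"{BASE_URL}/{bucket}/{key}"
    # Every request carries its own credentials, so the server does not
    # need to remember anything about earlier requests.
    headers = {"Authorization": f"Bearer {TOKEN}"}
    return Request(url, data=body, headers=headers, method=method)


upload = build_request("PUT", "backups", "db.dump", body=b"example bytes")
download = build_request("GET", "backups", "db.dump")
delete = build_request("DELETE", "backups", "db.dump")
```

The HTTP verb says what to do and the URL says which object it applies to. Passing one of these objects to `urllib.request.urlopen` would actually send it, which this sketch deliberately avoids.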

Web service APIs need to be integrated with a specific application when used for cloud storage, which can create some challenges. If you want to avoid the need to integrate with an application, file-based protocols and block-based APIs can be used as alternative access methods. Another connection protocol is WebDAV, which extends HTTP so that clients can manage files on a remote server.
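Because WebDAV rides on HTTP, a WebDAV operation looks like an ordinary HTTP request with an additional method name. The sketch below builds (without sending) a `PROPFIND` request that asks a server to list a folder; the folder URL is a hypothetical placeholder.

```python
from urllib.request import Request

# Hypothetical WebDAV folder URL, for illustration only.
FOLDER_URL = "https://dav.example.com/files/reports/"

# Minimal PROPFIND body asking for all properties of each resource.
PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)

list_folder = Request(
    FOLDER_URL,
    data=PROPFIND_BODY,
    method="PROPFIND",  # a WebDAV method layered on top of HTTP
    headers={
        "Depth": "1",  # the folder itself plus its direct children
        "Content-Type": "application/xml",
    },
)
```

A server that supports WebDAV would normally answer such a request with an XML document describing each file in the folder.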

Delivery options for cloud storage

How will your cloud provider deliver your storage capability? You can use an appliance or connect to a public or remote storage service.

Although latency is a big issue for primary (tier 1) cloud storage, particularly for data used frequently, vendors are currently offering a different class of products called hybrid cloud storage solutions that may ultimately address primary storage. The idea is to use local and cloud-based resources to address performance issues associated with storage in the cloud.

Generally, these offerings consist of two things:

  • An appliance that is a physical or virtual server where the hardware and software are preconfigured so the user doesn’t have to understand the details

  • A connection to a remote storage service

The appliance intelligently handles the movement between the local storage and the cloud; to the end user, all of the data seems to be in one place.

A cache is a block of memory for temporary storage on the appliance that provides a high-speed buffer between your client and the cloud service. The cache uses a host of algorithms to keep the most frequently used data on the local, expensive hardware.

For read requests, the cache uses file attributes such as the age of the data and the time since it was last accessed to decide which data to keep locally. For write requests, the appliance may write the data locally on the machine and then burst it out to the cloud storage provider. The data is generally encrypted when it’s transported.
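The read and write behavior described above can be sketched as a small least-recently-used (LRU) cache with deferred writes. This is an assumption-laden toy: LRU is only one possible eviction policy, a plain dictionary stands in for the remote storage service, encryption is omitted, and the names `ApplianceCache`, `read`, `write`, and `flush` are hypothetical.

```python
from collections import OrderedDict


class ApplianceCache:
    """Toy LRU read cache with deferred writes (illustrative only)."""

    def __init__(self, cloud, capacity=2):
        self.cloud = cloud          # dict standing in for the remote service
        self.capacity = capacity    # maximum number of items kept locally
        self.local = OrderedDict()  # least recently used first
        self.pending = {}           # writes not yet sent to the cloud

    def _remember(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)         # mark as most recently used
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict the least recently used

    def read(self, key):
        if key in self.local:               # served from local storage
            self.local.move_to_end(key)
            return self.local[key]
        # Not cached: prefer an unsent write, otherwise fetch from the cloud.
        value = self.pending[key] if key in self.pending else self.cloud[key]
        self._remember(key, value)
        return value

    def write(self, key, value):
        self.pending[key] = value           # acknowledged locally first
        self._remember(key, value)

    def flush(self):
        self.cloud.update(self.pending)     # burst buffered writes out
        self.pending.clear()


cloud = {"a": 1, "b": 2, "c": 3}
cache = ApplianceCache(cloud, capacity=2)
for key in ("a", "b", "a", "c"):
    cache.read(key)
cache.write("d", 4)
```

After these calls the local cache holds only the two most recently touched items, and the new item reaches the simulated cloud only once `cache.flush()` runs.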

Functions of cloud storage

The type of information you need to store and how quickly you need to access data both have an impact on the type of storage you will use. You can use policy-based replication to enable more granular control over how and where data is stored.
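One way to picture policy-based replication is as a lookup table from a class of data to the number and location of its copies. The sketch below is a hypothetical illustration; the policy names, copy counts, and location labels are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    copies: int       # how many replicas to keep
    locations: tuple  # preferred locations, in priority order


# Hypothetical policies keyed by data classification.
POLICIES = {
    "critical": ReplicationPolicy(3, ("on-premises", "cloud-east", "cloud-west")),
    "general": ReplicationPolicy(2, ("on-premises", "cloud-east")),
    "archive": ReplicationPolicy(1, ("cloud-west",)),
}


def placement_for(data_class):
    """Return the locations that should hold a copy of this class of data."""
    policy = POLICIES[data_class]
    return list(policy.locations[: policy.copies])


critical_sites = placement_for("critical")
archive_sites = placement_for("archive")
```

Changing where a class of data lives then means editing one policy entry rather than touching every file.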

Cloud storage can serve multiple purposes:

  • General-purpose storage for day-to-day or periodic use

  • Data protection and continuity, which can include data replication and backup and restore functionality

  • Archive and records management, meaning recoverable long-term data retention to support compliance and regulatory requirements

Benefits of cloud storage

Some of the benefits of cloud storage include:

  • Agility: The elastic nature of the cloud enables you to gain potentially unlimited storage.

  • Fewer physical devices to purchase and maintain: When you’re storing data in a data center, you have to plan for the servers that will be part of this storage solution. This means you need to purchase the machines and maintain them during their lifecycle. Additionally, you must make sure that you have enough space and can meet power requirements.

    In the cloud, you don’t have to purchase physical devices or deal with environmental issues. The cloud provider should do this for you (but it pays to do your homework on the services that your provider offers).

  • Disaster recovery: The cloud can serve as a good replacement for tape or other backups and can minimize concerns about your own data center capacity to support your backups. Instead of continuing to expand your on-premises storage, your information can be backed up to the cloud. If your systems go down, you can retrieve your data from the cloud.

  • Cost: DAS (direct-attached storage) is relatively inexpensive, but NAS (network-attached storage) and SAN (storage area network) devices require significant capital expenditures. The cloud storage model is based on usage: you pay only for what you use.
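The difference between paying for usage and paying for peak capacity can be shown with simple arithmetic. The figures below are hypothetical placeholders, and applying the same rate to both models is an assumption made only to isolate the effect of usage-based billing; real prices and hardware costs will differ.

```python
# Hypothetical figures for illustration; real prices and usage will differ.
RATE_CENTS_PER_GB_MONTH = 2
monthly_usage_gb = [100, 150, 400, 120]


def usage_based_bill(usage_gb, rate_cents):
    """Pay only for the gigabytes actually stored each month."""
    return sum(gb * rate_cents for gb in usage_gb)


def peak_provisioned_bill(usage_gb, rate_cents):
    """Pay every month for enough capacity to cover the busiest month."""
    return max(usage_gb) * rate_cents * len(usage_gb)


cloud_cents = usage_based_bill(monthly_usage_gb, RATE_CENTS_PER_GB_MONTH)
fixed_cents = peak_provisioned_bill(monthly_usage_gb, RATE_CENTS_PER_GB_MONTH)
```

With these made-up numbers the usage-based total is 1,540 cents, while sizing for the 400 GB peak in every month comes to 3,200 cents.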

No solution comes without drawbacks. Off-premises storage can affect performance, which now depends on the connectivity and latency between your LAN/WAN and your cloud provider. Additionally, you need to deal with issues such as the security that your cloud provider puts in place and the availability of your cloud provider.
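A rough way to see how connectivity and latency affect performance is to estimate transfer time as one round trip plus the payload size divided by the bandwidth. This is a deliberately simplified model that ignores protocol overhead, congestion, and retries, and the link speeds below are illustrative assumptions.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s, round_trip_s):
    """Estimate transfer time: one round trip plus time on the wire."""
    return round_trip_s + (size_bytes * 8) / bandwidth_bits_per_s


ONE_GB = 10**9  # bytes

# Hypothetical links: a 10 Gbit/s LAN and a 100 Mbit/s WAN connection.
lan_time = transfer_seconds(ONE_GB, 10 * 10**9, round_trip_s=0.001)
wan_time = transfer_seconds(ONE_GB, 100 * 10**6, round_trip_s=0.05)
```

Under these assumptions the same 1 GB file takes under a second on the local network and about 80 seconds over the slower link to the cloud provider.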