What is cloud storage?

January 9, 2014 General, Technical

Storage ain’t storage anymore – not in the world of web hosting anyway. Long gone are the days of servers writing to locally installed hard disks; cloud storage is abstracted from the physical server itself, instead residing on a high-speed network of distributed storage clusters. These clusters contain gazillions of hard drives which automatically replicate your data between different nodes for safety, scalability, performance and ease of management.

Storage can be consumed through one or more interfaces, depending on which best suits the task at hand. Anchor’s distributed storage system is built using Ceph – and we love its speed, flexibility and industry-leading reliability. Here’s an overview of the different ways in which Anchor’s storage platform can be consumed and why you might choose one interface over another.

Block Storage

Block level storage manages data as blocks within sectors and tracks, and presents itself to servers using a Fibre Channel or iSCSI interface (an industry-standard mechanism traditionally used by SANs) that is supported by most operating systems. A server can connect to these volumes and use them in the same way it would an internally installed hard disk, making block storage suitable for almost any application.

As far as your virtual or cloud server is concerned, a disk volume presented as block storage is simply a hard disk that it can format and use as normal – with the added flexibility, performance and reliability benefits inherent in distributed storage. Taking the benefits a step further, traditional concerns around single points of failure and the capacity limits of physical hard disks no longer apply.

Object Storage

The vast majority of cloud storage providers supply services that make use of the object storage architecture. Object storage thinks of your data as objects, not as blocks or as a file hierarchy.

Object storage is commonly used to store assets such as files, photos and video, as they can be served and shared directly without the need for a web server (although one can still be used). It is also used for large scale “big data” applications – once you are dealing with millions or even billions of objects, a file system approach does not scale well and suffers from bottlenecks and major performance problems.

Unlike traditional file systems, object storage is not dependent on a hierarchical layout of directories and sub-directories. Each object contains the data itself, metadata (which is data about your data) and a unique identifier. Objects cannot be nested inside one another; instead, the unique identifier allows a server or end user to retrieve an object without knowing the physical location of its data. Object storage is massively scalable without impacting performance, and can be distributed geographically.
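To make the flat layout concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not how Ceph or Anchor's platform is implemented; the names `StoredObject` and `FlatObjectStore` are invented for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the payload, its metadata and a unique identifier."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class FlatObjectStore:
    """Hypothetical store with a flat namespace: lookups are by ID, never by path."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        # Bundle data and metadata together and hand back the identifier.
        obj = StoredObject(data=data, metadata=metadata)
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id):
        # The caller needs only the identifier, not a directory or disk location.
        return self._objects[object_id]


store = FlatObjectStore()
photo_id = store.put(b"\x89PNG...", content_type="image/png", owner="alice")
photo = store.get(photo_id)
print(photo.metadata["content_type"])
```

There are no directories anywhere in this sketch; anything that looks like a folder in a real object store is usually just a naming convention applied to the identifier.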

Object storage also provides programmatic (API) and HTTP interfaces (just like retrieving a web page) to allow users and applications to manipulate data. Most API implementations are REST-based, allowing the use of many standard HTTP calls.
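As a rough illustration of what "standard HTTP calls" means here, the Python sketch below builds (but does not send) the kind of requests a REST-style object API typically accepts. The endpoint `https://objects.example.com`, the bucket/key URL layout and the helper names are assumptions made up for this example; a real service will have its own URL scheme and will require authentication.

```python
from urllib.request import Request

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://objects.example.com"


def build_upload(bucket, key, body, content_type):
    """PUT stores (or overwrites) an object under the given key."""
    return Request(
        f"{BASE_URL}/{bucket}/{key}",
        data=body,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def build_download(bucket, key):
    """GET retrieves an object, just like fetching a web page."""
    return Request(f"{BASE_URL}/{bucket}/{key}", method="GET")


def build_delete(bucket, key):
    """DELETE removes an object."""
    return Request(f"{BASE_URL}/{bucket}/{key}", method="DELETE")


upload = build_upload("photos", "cat.jpg", b"...jpeg bytes...", "image/jpeg")
download = build_download("photos", "cat.jpg")
print(upload.get_method(), upload.full_url)
print(download.get_method(), download.full_url)
```

Passing one of these `Request` objects to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without a network connection.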

When it comes to data durability, Object storage commonly uses Erasure Coding (EC), a software-based data protection scheme in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations, such as disks, storage nodes or geographic locations. Should a hardware failure result in loss of up to two of the primary segments, the system is designed to reconstruct the original data using parity information. This ability to provide data durability also means that less investment has to be made in data protection. Object storage does not need either RAID or backing up – not as we know it for traditional file system storage anyway.
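The idea of rebuilding lost fragments from redundant pieces can be shown with a deliberately simplified Python sketch. It uses a single XOR parity fragment, so it survives the loss of only one fragment; tolerating two or more losses, as described above, needs a stronger code such as Reed-Solomon. The function names and the choice of four data fragments are assumptions for illustration and do not describe Ceph's actual implementation.

```python
def split_into_fragments(data, k):
    """Split data into k equal-sized fragments, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_fragments(fragments):
    """XOR equal-length fragments together byte by byte."""
    out = bytearray(len(fragments[0]))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            out[i] ^= byte
    return bytes(out)


def encode(data, k):
    """Return k data fragments followed by one parity fragment."""
    fragments = split_into_fragments(data, k)
    return fragments + [xor_fragments(fragments)]


def recover(stored):
    """Rebuild at most one missing fragment (marked as None) from the survivors."""
    missing = [i for i, fragment in enumerate(stored) if fragment is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost fragment")
    rebuilt = list(stored)
    if missing:
        survivors = [fragment for fragment in stored if fragment is not None]
        rebuilt[missing[0]] = xor_fragments(survivors)
    return rebuilt


original = b"object storage keeps data durable"
stored = encode(original, k=4)   # each fragment would live on a different disk or node
stored[2] = None                 # simulate losing one disk
rebuilt = recover(stored)
restored = b"".join(rebuilt[:4])[:len(original)]  # drop parity and padding
print(restored == original)
```

In a real cluster each fragment would sit on a different disk, node or site, so a single hardware failure removes at most one fragment from the set.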

File System (NFS, SMB/CIFS, AFP, etc)

File level storage is accessible using common protocols such as SMB/CIFS and NFS, making it available from Windows, Mac and Linux. The file system approach is ideal for human users but not for big data applications that must manage millions of objects. File systems allow humans to organise content in an understandable hierarchy where access speed and programmatic control aren’t of paramount importance.

Nothing beats the simplicity of file level storage when all that’s needed is a place to dump your files. Many of our clients use NFS storage as a centralised, highly available place to store files and other assets that can be accessed from multiple web servers – a common scenario when adding front end web servers to cope with periods of increased traffic to your site. Your web servers simply use the centralised asset directory on an NFS share, solving any data replication headaches and making horizontal scaling (that is, adding more web servers behind a load balancer) much simpler to achieve.

If you’re looking for high levels of storage performance, the file level option may not be the best bet. Block and Object level devices are generally far more configurable for high performance.