One of the questions posed to the “ask me anything” storage panel at this week’s TECHunplugged event was whether we will ever see a single storage protocol develop. This is an interesting idea and, with the move to object storage, seems to have some merit. However, as always with technology, the most obvious answer is “it depends”, so let’s dig down into what “depends” could actually mean in this case.
There are, and have been, lots of ways to access data stored on persistent storage, including disks in servers as well as external storage arrays. These include (this is not a comprehensive list):
- ATA – AT Attachment interface for early PCs, that derived from IDE (Integrated Drive Electronics) and provided us with PATA (parallel) and SATA (serial) interfaces.
- SCSI – Small Computer Systems Interface, originally a parallel interface for more commercial servers, SCSI gives us SAS (Serial Attached SCSI) and is the protocol that sits on top of Fibre Channel and iSCSI.
- IBM CCW – Channel Command Words, that combine to create a “channel program” for mainframe I/O, running over Bus & Tag, ESCON and later FICON interfaces.
- Fibre Channel – derivation of ESCON for open systems, widely used today for external storage arrays and internally in some legacy platforms for connecting drives. We also had an attempt to use Fibre Channel over Ethernet, but that wasn’t successful.
- CIFS/SMB – client server protocol for accessing file-based data on Windows servers (although originally designed by IBM) and significantly extended in the latest SMB3.x releases.
- NFS – Client/server file access protocol developed by Sun Microsystems in the mid-1980s and widely used today.
- REST API, SOAP – HTTP/web-based protocols for accessing object data across local and wide area networks.
There are other protocols (more on those later), but here we’re covering the most common and typically those that are used for connecting external storage. Pretty much all of the I/O protocols fall into three main categories:
- Block – ability to address and access individual blocks of data with high granularity (as little as 512 bytes) and low latency per I/O request. Data is stored on LUNs or volumes, with the storage device having little or no intelligence on the content of the data. Access requires some additional formatting (e.g. layering a file system onto a LUN) or intelligence in the application (e.g. a database accessing raw volumes).
- File – data is accessed as files in a hierarchical structure (e.g. the file system) that provides advanced features like data security (ACLs), file locking (serialising access to one process at a time), metadata (date/time accessed, file size etc) and user-friendly object names. Caching allows sub-file updates, rather than having to retrieve/store the entire file with each modification.
- Object – data is stored and accessed as binary objects, with no direct understanding of the format by the object platform, although internally an object may have structure and may be accompanied by metadata that describes it in more detail. Typically object stores don’t allow sub-object updates, with create/retrieve/update commands effectively working on the entire object as an “atomic” operation. Object stores are more scalable than file servers, typically capable of storing larger and more objects without the overhead of a file system. Object stores work on web-based protocols.
At the transport layer, modern systems use Ethernet, Fibre Channel or FICON. FICON is restricted to the IBM mainframe domain, so we can set it aside for this discussion. As a dedicated transport, Fibre Channel has significant benefits over Ethernet, such as lossless delivery, but it was designed to be a local protocol and needs extensions to work over wide area networks.
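To make the three categories concrete, here is a minimal in-memory sketch of the access semantics described above. It is an illustration only: the class and method names (BlockDevice, FileStore, ObjectStore and so on) are hypothetical and don't correspond to any real protocol or product API.

```python
from typing import Dict, Optional, Tuple

BLOCK_SIZE = 512  # bytes; the smallest granularity mentioned for block access


class BlockDevice:
    """A LUN/volume: a flat array of fixed-size blocks with no idea of content."""

    def __init__(self, num_blocks: int) -> None:
        self._num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, lba: int) -> int:
        if not 0 <= lba < self._num_blocks:
            raise IndexError("logical block address out of range")
        return lba * BLOCK_SIZE

    def write_block(self, lba: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly one block long")
        start = self._check(lba)
        self._data[start:start + BLOCK_SIZE] = payload

    def read_block(self, lba: int) -> bytes:
        start = self._check(lba)
        return bytes(self._data[start:start + BLOCK_SIZE])


class FileStore:
    """Named files in a hierarchy, with byte-range (sub-file) updates."""

    def __init__(self) -> None:
        self._files: Dict[str, bytearray] = {}

    def write_at(self, path: str, offset: int, payload: bytes) -> None:
        buf = self._files.setdefault(path, bytearray())
        end = offset + len(payload)
        if len(buf) < end:
            buf.extend(bytes(end - len(buf)))  # zero-fill up to the new length
        buf[offset:end] = payload

    def read(self, path: str) -> bytes:
        return bytes(self._files[path])


class ObjectStore:
    """Flat key space: each put replaces the whole object and its metadata."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def put(self, key: str, data: bytes,
            metadata: Optional[Dict[str, str]] = None) -> None:
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key: str) -> Tuple[bytes, Dict[str, str]]:
        return self._objects[key]
```

Note how only the file model offers a partial update of a named item; the block model has no names at all, and the object model replaces everything under a key in one operation.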
Horses for Courses
Already we can see how each protocol has developed to meet specific requirements. Block-based systems are great for transactional data where the update granularity is small and potentially relatively random. File provides much more structure over data, with object sitting somewhere in between. Object does better in scalability – but not in latency/performance. In fact, latency isn’t that relevant a measure for object stores; instead we should look at throughput and time to first byte.
In order to try and rationalise protocols and transports, we could focus on Ethernet. However, as already mentioned, there are issues with standard Ethernet, which were meant to be addressed with Data Centre Bridging and FCoE. Unfortunately for the proponents of this technology, traditional Fibre Channel remains remarkably stubborn to shift, which is not surprising for any technology that has such a huge investment in it from end users. Of course, by that I’m referring not only to hardware, but also to the knowledge and experience in building storage networks, which would have to be relearned.
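As a rough illustration of why throughput and time to first byte tell us more than per-I/O latency, the sketch below models a retrieval as a fixed wait followed by a stream. The model is a simplification, and the figures (50 ms to first byte, 100 MB/s) are made-up assumptions rather than measurements of any real object store.

```python
def transfer_time(size_bytes: int, ttfb_s: float,
                  throughput_bytes_per_s: float) -> float:
    """Seconds to retrieve an object: wait for the first byte, then stream."""
    return ttfb_s + size_bytes / throughput_bytes_per_s


def ttfb_share(size_bytes: int, ttfb_s: float,
               throughput_bytes_per_s: float) -> float:
    """Fraction of the total retrieval time spent waiting for the first byte."""
    total = transfer_time(size_bytes, ttfb_s, throughput_bytes_per_s)
    return ttfb_s / total


# Hypothetical figures, chosen only to show the shape of the trade-off.
TTFB_S = 0.05              # 50 ms to first byte
THROUGHPUT = 100_000_000   # 100 MB/s sustained

for label, size in (("4 KiB", 4096), ("1 GiB", 1024 ** 3)):
    total = transfer_time(size, TTFB_S, THROUGHPUT)
    share = ttfb_share(size, TTFB_S, THROUGHPUT)
    print(f"{label}: {total:.3f} s total, {share:.1%} waiting for first byte")
```

With these assumed numbers, a block-sized request is almost entirely wait time, whereas for a large object the wait is a rounding error next to the streaming time.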
There are some consolidations that could be done. File is really a specific form of the object protocol, using the same transport and IP-based technology to access data. Many object stores are moving to support file-based protocols natively, so it’s easy to see how file and object could merge. In fact, many storage systems (for example OneBlox from Exablox) already use object as the underlying storage mechanism, yet present data as traditional file.
Block is however more difficult to deal with. A LUN or volume could be emulated by an object store that supported sub-object updating. However, object store systems are typically designed for scale (meaning capacity) and would have trouble with small-block updates. Erasure coding as a data protection mechanism within object stores is one example of where managing small block updates would be a real performance issue.
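The small-block problem can be sketched as follows. This is a toy model, assuming an object store that only supports whole-object get and put, and a hypothetical layout where the LUN is carved into 4 MiB objects; WholeObjectStore and EmulatedLun are illustrative names, not a real API. Erasure coding is not modelled here, only the read-modify-write cost.

```python
BLOCK_SIZE = 512                # bytes per logical block
OBJECT_SIZE = 4 * 1024 * 1024   # assumed 4 MiB object per LUN extent


class WholeObjectStore:
    """Object store with no sub-object updates; counts bytes moved."""

    def __init__(self) -> None:
        self._objects = {}
        self.bytes_read = 0
        self.bytes_written = 0

    def get(self, key: str) -> bytes:
        # An extent that has never been written reads back as zeroes.
        data = self._objects.get(key, bytes(OBJECT_SIZE))
        self.bytes_read += len(data)
        return data

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)
        self.bytes_written += len(data)


class EmulatedLun:
    """Presents block reads/writes on top of whole-object operations."""

    BLOCKS_PER_OBJECT = OBJECT_SIZE // BLOCK_SIZE

    def __init__(self, store: WholeObjectStore) -> None:
        self._store = store

    def _locate(self, lba: int):
        key = f"extent-{lba // self.BLOCKS_PER_OBJECT}"
        offset = (lba % self.BLOCKS_PER_OBJECT) * BLOCK_SIZE
        return key, offset

    def write_block(self, lba: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly one block long")
        key, offset = self._locate(lba)
        extent = bytearray(self._store.get(key))        # read the whole object
        extent[offset:offset + BLOCK_SIZE] = payload    # modify one block
        self._store.put(key, extent)                    # write the whole object

    def read_block(self, lba: int) -> bytes:
        key, offset = self._locate(lba)
        return self._store.get(key)[offset:offset + BLOCK_SIZE]


store = WholeObjectStore()
lun = EmulatedLun(store)
lun.write_block(10, b"\xab" * BLOCK_SIZE)
print("write amplification:", store.bytes_written // BLOCK_SIZE)
```

In this model a single 512-byte update moves a full 4 MiB object in each direction. Smaller objects reduce the amplification but multiply the object count, which is the trade-off a capacity-oriented design is not built for.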
I’m sure some readers are thinking that we’ve already started to manage other object types, with the introduction of VVOLs for VMware and Tintri’s native support over NFS. In fact VVOLs are a kludge that simply uses block-based protocols like Fibre Channel to create multiple LUNs (one for each file & type) and provides external management capability. It isn’t actually a new protocol as such.
The Architect’s View™
As one of the other panellists pointed out, we will have to accept the need for multiple I/O protocols, although we could rationalise the transport layer. Having more intelligence in block protocols would also help; for example having sub-LUN locking.
What’s not in here is any discussion about high-performance protocols like InfiniBand, iWARP, RoCE (RDMA over Converged Ethernet) or NVMe over Fabrics, which could become more mainstream over time. As we can see, the effort to reduce protocols is being frustrated by the need to develop faster, more efficient technologies. So the hope of massive protocol consolidation is some way off, or more likely will never happen.
Copyright (c) 2009-2022 – Brookend Ltd, first published on http://www.architecting.it/blog, do not reproduce without permission. Post #b3ac.