This is the fifth post in a series looking at predictions for the storage industry in 2021. The first four posts are here:
- Storage Predictions for 2021 and Beyond (Part I – Media)
- Storage Predictions for 2021 and Beyond (Part II – Systems)
- Storage Predictions for 2021 and Beyond (Part III – SDS)
- Storage Predictions for 2021 and Beyond (Part IV – CAS)
Open Source represents a revolution in the way software-defined storage and other products are developed. Could the open-source model threaten the future of commercial storage software, or is it just another consumption choice? What can we expect in this area over the coming decade?
The concept of Open Source is based on the free software movements that have existed for decades. Across all parts of the industry, software is developed and distributed for free, either as a labour of love, as part of a business model or as part of research work. Modern open-source software is generally developed and made available through licensing models that give users certain rights of use and modification. Most require attribution back to the original authors and place restrictions on how derived works are shared or used.
Why should developers make software freely available to others? Creating software is time-intensive and requires considerable skills that could be rewarded through a proprietary development model. Arguably, the open-source model offers the following benefits (and others):
- Collaboration – software developers can share thoughts, ideas, coding techniques, concepts and other facets across a broad audience of potential contributors. This process aids in diversity of thought, as well as assisting in bug tracking and resolution.
- Time to Market – with potentially hundreds or thousands of eager coders, projects can develop more quickly than a commercial company could manage in a similar timescale (subject to rigorous acceptance criteria on code development and promotion).
- Acceptance – free software has a way of being accepted much more quickly than commercial products. We know from the early Shareware phase of free software that free is always popular, even if a product isn’t that good. (I recommend reading a book such as “Predictably Irrational”, which discusses the concept of free or zero cost.) Early adopters generally promote Open Source.
- Kudos or Recognition – contributing to Open Source is a great way to build industry recognition and reduce barriers to entry for those looking to get into IT. Developing open-source software is like building an online CV.
There is also the aspect of developers wanting to go against the establishment and commercial software by developing their own offerings. This is potentially one altruistic angle, but I doubt this is behind the majority of development projects.
From a business perspective, an open-source offering is more likely to gain adoption and recognition than a commercial solution. Once again, the power of “free” is at play. Vendors typically offer open-source products for free, charging for “enterprise” versions or making money through support.
Open-source storage solutions have been around for decades. Free mainframe software from the SHARE user group has been distributed since the 1950s. I personally used SHARE (from the printed copy of the newsletter) in the 1980s and 1990s.
Arguably, modern Open Source and free storage software have developed from the connectivity of the Internet and freely available Linux and Unix operating systems. Over the last ten years, we’ve seen four primary areas of development:
- File systems – FUSE, GlusterFS, LizardFS, MooseFS, BTRFS, RozoFS and many more, including ZFS which is included in many operating system distributions (although it spans the commercial/non-commercial divide after Oracle acquired Sun Microsystems).
- Unstructured data stores – this includes Alluxio, Ceph, MinIO, HDFS (part of Hadoop), OpenIO, Lustre and others.
- NAS filers – this category includes FreeNAS, OpenFiler, and OpenMediaVault and generally covers SMB or home user use cases.
- Databases – MySQL, MariaDB, and many other solutions. These platforms have been developed either as alternatives to commercial enterprise products or to create new categories (such as NoSQL).
Probably the only storage category without widespread development is block-based storage systems. DRBD seems to be the only product that is available to build block storage, although some of the other unstructured solutions do offer iSCSI volumes. There are also container-attached storage solutions which we discussed in the previous post.
Looking at the existing market of products, we can see that Open Source has been used as a model to develop software that meets the needs of large-scale unstructured data stores. Many educational and scientific organisations use open-source solutions like Ceph because they can be built out cheaply and easily. The support model isn’t generally a problem when there is a ready supply of Computer Science students available. Open Source in this respect isn’t (currently) challenging commercial platforms.
Where can we expect storage and Open Source to go in the coming decade?
- Increased adoption for structured data. Linux and the public cloud have driven the use of open-source solutions and made them easily consumable. Many NoSQL platforms have developed exclusively from the open-source model. We can expect to see a long-term decline in the use of commercial database software in favour of equivalent open-source solutions.
- Consolidation & Rationalisation. There are arguably too many open-source file systems available on the market today. This part of the market needs to rationalise down to the file system solutions that offer real value to end-users. Building file systems is hard, and these offerings are least likely to be adopted across commercial and enterprise customers.
- Object storage moves 100% Open Source. Object storage may become entirely delivered through the open-source model. The rationale is cost versus value: it is hard to justify paying for storage software to retain large volumes of inactive data or content that has no currently perceived value, only potential future value. Existing commercial object storage solutions can then choose to bifurcate and offer commercial versions for high-performance object and file stores.
- Analytics becomes the next battleground. Open-source solutions move “up the stack” and offer solutions for analytics and AI. This area is currently lucrative for commercial solutions and so could generate interest for open-source development.
- Greater Forking and Fragmentation. Although we’re suggesting rationalisation is needed in some areas, some solutions are likely to see increased diversity (as we discuss in a moment).
The Cloud Factor
One interesting development in Open Source has been the introduction of more restrictive licensing for solutions from MongoDB and Elastic. The driver for this change has come from the use of these platforms in the public cloud. Open Source is effectively a free treasure trove of resources on which to build a public cloud infrastructure. There is no requirement for cloud providers to pay back by helping in the development of these solutions.
If AWS and other cloud service providers choose to fork popular open-source solutions, then the market will become fragmented with diverging products. Developers will be disincentivised from creating new solutions if there is little expectation of future reward for their efforts. This conundrum raises challenges for the future of Open Source. As a model, Open Source provides ready access to developers and a consumer market, albeit at the cost of exposing that intellectual property for others to use.
The Architect’s View
Despite the challenges of the public cloud, I expect to see open-source storage continue to thrive. Open Source is an excellent source of new ideas and, directly or indirectly, drives other parts of the industry. Many graduate research projects have turned into open-source storage products, providing a wealth of new opportunities. I hope we can resolve the short-term challenges of Open Source and the public cloud because both need each other and have existed in a symbiotic relationship since the cloud first emerged as a modern business model.
Note: The open-source market is so wide and deep that it’s not possible to cover every product and company in this short post. Feel free to contact us and point out any products and solutions we may have missed.
Related Podcasts & Blogs
- #167 – Adapting Open Source Storage for the Enterprise
- #46 – Another View on Open Source Storage with Neil Levine
- #41 – Does Open Source Have a Place in Storage?
- #189 – The Quiet Success of Software-Defined Storage
- Cloud Services – Build, Buy or Fork?
- Databases are the next battleground for Public Cloud
Copyright (c) 2007-2021 – Post #6882 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.