In doing some work on composability and the capabilities we expect from rack-scale or composable infrastructure, I’ve been thinking about the evolution of technology over the last 40 years. I’ve reached the conclusion that the (IBM) mainframe was the original rack-scale solution and could be a model for where we head back to in the future.
Composability vs Shareability
Take a look at figure 1, which shows the evolution of compute unit size versus the degree of composability available in each platform.
Over time, we’ve moved from large-scale monolithic mainframes, through mini-computers, to servers and eventually lightweight edge devices. At each step, the degree of composability has reduced.
On mainframes, we had pretty much total composability. Any resource (CPU, memory, networking, storage) could be mapped to a logical partition, or to a virtual machine running under VM.
The mapping features were all dynamic. With the later versions of MVS/ESA (System/390 architecture) and onwards, it was possible to add and remove processors dynamically to a partition (LPAR). Memory could be added and removed (although the real memory slots needed to be evacuated/drained, so the process wasn’t instant).
Storage could be added and removed dynamically, and network definitions changed on the fly.
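For illustration, this kind of dynamic reconfiguration on z/OS is driven from the operator console. The commands below are an abbreviated sketch from memory rather than a definitive reference (the exact operands vary by release, so check the MVS System Commands manual):

```
CF CPU(1),ONLINE       bring an additional processor online to the LPAR
CF CPU(1),OFFLINE      take the processor away again
CF STOR(E=1),OFFLINE   drain and remove a storage element from the configuration
D M=CPU                display the current processor configuration
```

The point is that all of this happens against a running system, with no outage to the workloads in the partition.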
Many of these capabilities have been around for 30 years. Storage functionality was implemented through features like ESCON and EMIF that would be analogous to storage area networking and SR-IOV today.
Mainframe software in the form of MVS (now z/OS) and VM (now z/VM) provided the infrastructure configurability to change definitions to running systems. Amdahl mainframe systems were more sophisticated and provided capabilities to oversubscribe processors and logically apply variable weightings to each partition running a z/OS system.
Internally, z/OS runs either batch work (jobs), long-running jobs (started tasks) or user sessions (TSO). Each task runs as a separate set of processes with a virtual address space and both isolated and shared memory. In some respects, batch jobs are like containers, whereas started tasks are similar to Linux daemons or Windows services.
Another subsystem, CICS, provides transaction-based processing that looks similar to the implementation of containers. I’m being careful here to say “similar”, as there’s no direct translation between the two. In fact, CICS operates in a single shared address space, so it is both similar and dissimilar to containers in different respects.
OK, so the IBM mainframe of 30-40 years ago looked like rack-scale, but so what? Look at figure 1 again and you can see that over time, we’ve both reduced the compute size of our hardware and reduced the sharing capability too. Individual resources have become much more isolated into separate servers.
Some open systems platforms of the late 1990s/2000s did offer a degree of composability. We can think of examples like the Sun Fire 12K and 15K systems, Egenera and even blade servers. But as we’ve moved further to the right in the diagram, server hardware has become smaller in form factor and (generally) compute unit size, while the level of shared infrastructure has reduced.
With server virtualisation, does it matter so much that we’ve moved away from composable infrastructure? We have the ability to cluster physical servers and dynamically move virtual machines. This means we can rebalance workloads across physical infrastructure pretty easily with tools like vMotion and DRS.
Possibly the only drawback here arises if we want to build virtual machines bigger than a single server, or ones that consume the majority of resources from a single server. In this case, there’s a risk of wasting resources or having to break an application down into smaller components.
This brings us nicely on to the microservices architecture, where we can run applications as a set of processes on a single operating system instance. The dispatching of containers across the hardware can be controlled by demands for compute, memory and storage resources, potentially at a better level of granularity than virtual machines.
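The demand-driven dispatching described above can be sketched as a simple first-fit placement loop. This is an illustrative toy (the server names, capacities and `place` function are all invented for this post, not any real orchestrator’s API), but it captures the idea of placing containers by their compute and memory requests:

```python
# Toy demand-based container placement: first-fit by CPU and memory
# requests, loosely analogous to how an orchestrator bins containers
# onto nodes. Names and numbers here are hypothetical.

from dataclasses import dataclass

@dataclass
class Server:
    name: str
    cpu_free: float   # cores remaining
    mem_free: int     # MiB remaining

def place(containers, servers):
    """Assign each (name, cpu, mem) request to the first server that fits."""
    placement = {}
    for name, cpu, mem in containers:
        for s in servers:
            if s.cpu_free >= cpu and s.mem_free >= mem:
                s.cpu_free -= cpu
                s.mem_free -= mem
                placement[name] = s.name
                break
        else:
            placement[name] = None  # no capacity anywhere
    return placement

servers = [Server("node-a", 4.0, 8192), Server("node-b", 2.0, 4096)]
containers = [("web", 1.0, 1024), ("db", 2.5, 6144), ("cache", 1.5, 2048)]
print(place(containers, servers))
# → {'web': 'node-a', 'db': 'node-a', 'cache': 'node-b'}
```

Because requests are expressed per container rather than per virtual machine, the scheduler can pack workloads at a finer granularity, which is exactly the advantage claimed above.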
So, with the ability to distribute application workloads across physical infrastructure, why are vendors pushing to develop composable solutions? I can see a number of scenarios:
- Choice – I want to be able to control the components of my system, for example, varying the processor to memory ratio or picking specific processors out for performance. I might also want to add in GPUs or other custom devices.
- Predictability – this extends the choice argument. A mixed infrastructure environment might not provide consistent application performance if applications are simply despatched on an amorphous pool of heterogeneous resources.
- Security – I might not want all of my resources (like storage) to be available across all servers, for example.
- Failure Domain – Building out discrete groups of infrastructure provides the ability to manage failure more easily.
- Human Factors – we could quote this generally as risk, but specifically here the risks involved with adding and removing infrastructure that involves bodies in the data centre. A “zero-touch” approach is much more desirable than reconfiguring hardware manually, however infrequently performed.
Looking back to the mainframe days, although z/OS software was highly resilient, an outage had a huge impact. So, there’s generally a need to balance flexibility with risk.
Units of Measure
The interesting aspect of composability is the level of granularity we can achieve. Today, we can distribute storage resources easily with NVMe over Fabrics. PCI Express switching also allows additional devices, such as GPUs and FPGAs, to be composed to servers.
Getting to the next stage of composability is going to be tricky. Today’s architectures put processors and memory on a physically short local bus. Extending this outside of a single server may naturally introduce unacceptable latency.
Perhaps the first step in memory composability will be introduced with technologies like Gen-Z and persistent memory that don’t deliver the same levels of latency as DRAM. In this instance, a composable system could have extra persistent memory mapped to it for a fixed period of time while running specific applications that need a large memory address space, for example.
The Architect’s View®
Although the mainframe might seem like a model of composability, the systems that I’m referring to were far from practical. Installation and deployment were hugely expensive and time-consuming, although I expect modern z/OS systems are a lot better. Mainframes do show that having software-configurable resources has a huge benefit, but I don’t think we’ll be going back to wide-scale mainframe deployment any time soon.
Instead, I think we will see rack-scale and composable solutions start with the relatively easy peripheral devices like storage. With NVMe over Fabrics, centralising storage and providing access to individual servers can be done over Ethernet with latencies of sub-100µs and better.
This will extend out (and already has) with technologies like that from Liqid, which can add custom processors. Total composability may be an impossible dream, although some of the work done by HPE on The Machine could get us close.
In the meantime, composability will depend on using the benefits of application virtualisation to provide a compromise that is good enough for most modern applications.
No reproduction without permission. Post #0F45.