Many times in my career I’ve been faced with managing legacy secondary storage media. By this, I generally mean tapes, but equally, we see backup data being stored on removable drives or file servers too. One of the biggest challenges in data protection has been to build a plan to deal with legacy media. Data ingest and recovery from old tapes is notoriously hard and expensive. So how can enterprises avoid the problem while dealing with that legacy of old storage?
How Did We Get Here?
Secondary data has a very different usage profile to production or primary data. Where primary data represents the content actively being used by the business, secondary data can be short-term backups, archive data or unstructured content that’s being retained for some theoretical future value. In this discussion, we’ll focus just on the backup portion.
If we compare primary and secondary data, we see that secondary content has a much longer “tail” than primary. We keep backups (and archives) for weeks, months, even years for multiple reasons that we’ll discuss in a moment. Over time, the value of that data changes. We’re likely to restore from the most recent backups, simply because we usually detect a problem (like a corrupt or accidentally deleted file) fairly quickly after it occurs.
Some data recovery requirements can be more subtle. Ransomware has introduced the need to keep data for extended periods because intrusions aren’t always detected immediately. Ransomware attacks can target the backup platform first, to prevent recovery of maliciously encrypted data. So, secondary data also needs to be “air-gapped”. Over time, the requirement to restore from a backup will diminish, decreasing the value of keeping it around.
Why Do We Have the Problem?
What caused the retention of old media and backup data? We can identify at least the following scenarios.
- Poor Man’s Archive. Backup has always been used in place of a good archive solution. Building archiving functionality into an application can be challenging, especially with structured content. It’s much easier to simply keep system backups over time and restore an entire application to access archived data. Implementing archiving is good policy because it reduces the time to back up and restore.
- Compliance. Many businesses retain data for long periods to meet compliance rules. It’s common to see backups being retained for up to 10 years. Again, this is another archiving workaround.
- Poorly Understood Backup Requirements. It’s typical to see IT teams setting data protection standards, such as frequency of backup, retention times, RPOs and RTOs. Many lines of business may not know their requirements and simply default to an over-generous standard policy.
- Lack of Clear Data Management Policies. Extending the last point further, businesses need end-to-end data management policies that determine exactly how data is created, stored, protected and retained. Retention is particularly relevant here, as having an active deletion policy is as important as having retention standards.
- Lack of Industry Standards. There are no generic standards for the format of backup data. Each vendor chooses their own and they are rarely self-describing. Determining the information on a tape cartridge means trawling end-to-end with software capable of reading the format of the content.
One interesting aspect of the conflation of archiving and backup is the development of new data protection laws. GDPR introduced challenges such as meeting the “right to be forgotten”. How should IT organisations ensure that a restored backup doesn’t re-introduce previously deleted customer records, for example?
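One way to approach the question is to keep a log of erasure requests outside the backup platform and replay it against anything that gets restored. The following is a minimal sketch of that idea, not a description of how any particular product works. The `ErasureRequest` record, the `customer_id` field and the `apply_erasure_log` function are all hypothetical names used for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ErasureRequest:
    """One 'right to be forgotten' request (hypothetical record layout)."""
    customer_id: str
    erased_on: date


def apply_erasure_log(restored_rows, erasure_log):
    """Return the restored rows with erased customers filtered out.

    Assumes every row is a dict with a 'customer_id' key and that the
    erasure log is stored separately from the backup being restored.
    """
    erased_ids = {request.customer_id for request in erasure_log}
    return [row for row in restored_rows if row["customer_id"] not in erased_ids]
```

The important design point is that the erasure log has to live outside the backups themselves, otherwise restoring an old backup would also restore an old, incomplete log.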
Another side effect to bear in mind with historical backups is the challenge of identification. When we had long-lived physical servers, it was relatively easy to associate an application to a server. With virtualisation, that process got harder. With containerisation, that process will get harder still. This is one reason for abstracting the application and its encapsulation as a separate inventory.
One solution that has been used by many IT organisations is to create a “museum environment”. Essentially this means retaining the infrastructure (hardware and software) to re-instantiate a backup environment if and when a restore from old media is required. Museum environments create their own challenges:
- Software licences may expire and only be noticed at restore time. Workarounds like regressing the date/time can sometimes solve this problem. (Remember HourGlass anyone?)
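- Reading old media is a problem. LTO drives, for example, have limited backward compatibility: up to LTO-7, a drive can read tapes from at most two generations back (e.g. LTO-2 tapes can be read by LTO-3 & LTO-4 drives), while LTO-8 and LTO-9 drives read only one generation back. Museum environments mean keeping a lot of legacy hardware around.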
- Skills may not be available to do the recovery. As IT organisations transition to new software, the skills and knowledge of older solutions may have to be contracted in.
- The data being restored may be within an operating system or application that is no longer supported. All of the above backup software issues may also apply to the application software & O/S.
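To make the media compatibility point concrete, here is a small sketch that works out which LTO drive generations would need to be kept in a museum environment to read a given tape. It assumes the commonly published compatibility rule: drives up to LTO-7 read their own generation and two generations back, while LTO-8 and LTO-9 read only one generation back. The function names are my own, and the sketch only covers LTO-1 to LTO-9.

```python
from typing import List


def readable_tape_generations(drive_gen: int) -> List[int]:
    """Tape generations that an LTO drive of the given generation can read."""
    if not 1 <= drive_gen <= 9:
        raise ValueError("this sketch only covers LTO-1 to LTO-9")
    # Assumed rule: two generations back up to LTO-7, one back for LTO-8/9.
    generations_back = 2 if drive_gen <= 7 else 1
    oldest = max(1, drive_gen - generations_back)
    return list(range(oldest, drive_gen + 1))


def drives_that_read(tape_gen: int) -> List[int]:
    """Drive generations (LTO-1 to LTO-9) able to read a tape of tape_gen."""
    return [
        drive_gen
        for drive_gen in range(1, 10)
        if tape_gen in readable_tape_generations(drive_gen)
    ]


if __name__ == "__main__":
    for tape in (2, 6, 7):
        print(f"LTO-{tape} tapes need one of these drives: {drives_that_read(tape)}")
```

Running a tape inventory through something like `drives_that_read` gives a quick view of the oldest drive hardware that has to be retained.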
Keeping track of museum requirements in large organisations can become too costly, to the extent that it may be simpler to hold on to old media and pay for restoration services if/when recovery is required. However, that still doesn’t get over the challenge of knowing what is backed up on a specific tape or removable disk.
Plan of Action
How can we solve the problem of legacy backups? As the famous joke about the Irishman giving directions goes – “I wouldn’t start from here”.
Unfortunately, that’s a luxury many IT organisations don’t have. However, that doesn’t preclude the need to get your backup house in order before tackling the legacy content. As a first step, get your existing data protection on a good footing. Specifically:
- Set protection policies that meet the needs of data protection and not archive. This step will involve discussions with lines of business and the data controller(s) within the company. I would also publish the policy terms and the reasons behind them, so the process is clear and unambiguous (or at least challengeable).
- Look at alternative solutions for archiving. This is not likely to be a short fix if the applications in the business haven’t been built to manage this process. One solution may be to create a separate backup policy for archive, while the problem is addressed at the application layer. This equally applies to backups retained for test/dev purposes.
- Align backups with policy. Remember that implementing data protection doesn’t have to be based on a single product. All data protection is a combination of solutions (snapshots, clones, backups, replication) that create point-in-time recovery images. This process is also best done in conjunction with business owners, so the implications of any changes are well understood.
- Purge backups outside policy. This again is best done in conjunction with the business, but essentially, all non-standard backups should be released, excluding documented exceptions.
- Recycle/release old media. If this means disposing of tapes, then use your agreed disposal process or provider. Obviously, backups that have transitioned to disk or are taken via snapshot processes will need to be handled differently.
- Create a repository of applications. Businesses should have a separate inventory of applications that can be matched to the physical server, virtual server, object store bucket or file share containing data. If you don’t have one – create one now, even if it is just on an Excel spreadsheet.
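The purge and inventory steps above can be tied together in a few lines. This is only a sketch under assumptions: the `Backup` record, the shape of the inventory (a mapping from backup source to application name) and the per-application retention periods are all hypothetical, and a real backup catalogue would need to be exported into something similar first.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Backup:
    backup_id: str
    source: str                # server, VM, bucket or share that was backed up
    taken_on: date
    exception: bool = False    # documented exception agreed with the business


def review_backups(backups, inventory, retention_days, today):
    """Split backups into purge candidates and unidentified backups.

    inventory maps a backup source to an application name;
    retention_days maps an application name to its retention in days.
    """
    purge, unidentified = [], []
    for backup in backups:
        if backup.exception:
            continue  # documented exceptions are never purged automatically
        application = inventory.get(backup.source)
        if application is None or application not in retention_days:
            unidentified.append(backup.backup_id)  # needs a human decision
        elif today - backup.taken_on > timedelta(days=retention_days[application]):
            purge.append(backup.backup_id)
    return purge, unidentified
```

Anything that lands in the unidentified list is a sign that the application inventory is incomplete, which is worth fixing before any media is released.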
OK, so that still doesn’t address the main premise of this article and that’s dealing with legacy backups.
Proactive or Ad hoc
Fear of the unknown is what makes deciding what to do with legacy backups such a challenge. It’s difficult to give every business a single recommendation on how they should proceed with this problem. For organisations with tens of thousands of pieces of media (and I’ve seen that and more), the cost of recovering and converting all backup data will be too onerous.
These businesses might as well work on a combination of age and ad-hoc requirements. By this I mean, once a period of time has passed (e.g. 5 years since we catalogued that tape), then the media is disposed of. Where the business needs a legacy restore, this is done on-demand using one of many external companies that offer this service. At the same time, the contents of the tape can be catalogued, and a conversion or disposal plan made for backups on that media.
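The age-plus-ad-hoc rule described above is simple enough to express in code. The sketch below assumes a five-year threshold measured from the date a tape was last catalogued, as in the example. The function names and the three action labels are hypothetical, and any real policy would need to be agreed with the business first.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return start.replace(year=start.year + years, day=28)


def media_action(catalogued_on: date, today: date,
                 max_age_years: int = 5, restore_requested: bool = False) -> str:
    """Decide what to do with one piece of legacy media."""
    if restore_requested:
        # Ad hoc path: restore via an external service and catalogue the
        # contents at the same time.
        return "restore-and-catalogue"
    if today >= add_years(catalogued_on, max_age_years):
        return "dispose"
    return "retain"
```

For example, `media_action(date(2014, 1, 1), date(2019, 1, 1))` falls on the five-year boundary and so returns `"dispose"`.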
For smaller businesses, it makes sense to look at building a plan to convert backups from external media and start writing to a public cloud or a cheap object storage platform onsite.
The Architect’s View
None of these discussions addresses the big gap that remains in the market: the ability for existing backup solutions to ingest foreign media/content for use in the current backup platform. This seems to me to be a huge missed opportunity for new and old players alike.
Other than one solution that was acquired by IBM (Butterfly Software), why does this gap still exist? I would love to hear the thoughts of start-ups and incumbents on why this feature isn’t offered. My thought is that the benefit of developing such a complex feature is likely outweighed by the costs.
But for the online and SaaS data protection solutions on the market, this seems like a killer feature for getting a customer to move over to their platform. This is because the solution both solves a real problem for the customer and adds instant recurring revenue for the company.
Copyright (c) 2007-2019 Brookend Ltd, no reproduction without permission, in part or whole. Post #7c79.