Enterprises generate huge volumes of data every year, with average annual data growth of 40-50%. This growth has to be handled with IT budgets that are growing at an annual average of only 7%. This imbalance creates a challenge for mainframe professionals: how can they store all this data cost-effectively?
Particularly challenging is deciding on the right strategy for long-term storage, also known as cold storage, for archived data that is rarely or never accessed. There are various reasons for keeping such data long term, often for years or even decades:
- Financial data is stored for compliance and might be required in case of an audit
- Legal information must be kept in case of legal action
- Medical archives are stored in vast quantities and their availability is highly regulated
- Government data has to be stored for legal reasons, sometimes even indefinitely
- Raw data is stored by many enterprises for future data mining and analysis
Desired attributes of a cold storage solution
Cold storage, also referred to as “Tier 3 storage,” has different needs than Tier 0 (high-performance), Tier 1 (primary), and Tier 2 (secondary) storage. These are some of the considerations to keep in mind when designing your cold storage solution:
- Scalability – As the amount of generated data doubles in less than two years on average, your cold storage technology needs to scale accordingly, with no practical ceiling on capacity.
- Cost – Cold storage must be as inexpensive as possible, especially because you will need a lot of it. Luckily, because cold data is rarely accessed, you can compromise on accessibility and performance to reduce cost.
- Durability and Reliability – Durability is how long a storage medium is expected to remain usable; reliability is its ability not to fail within that time frame. Check both: some cold storage options are durable but not as reliable as others, and vice versa.
- Accessibility – Cold storage is meant only for data that does not need to be accessed very often or very rapidly, yet the ability to access it is still important. As mentioned above, compromising on this aspect enables a lower cost.
- Security – The security of cold data is vital. If it is stored onsite you need to take the same security precautions as with your active data. If it is in the cloud, you must ensure the vendor has proper security mechanisms in place.
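As a rough check on the scalability point above, the compound-growth arithmetic can be sketched in a few lines of Python. The function names and the 100 TB starting volume are illustrative assumptions, not figures from any particular environment. At the 40-50% annual growth cited earlier, the doubling time works out to roughly 1.7 to 2.1 years.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for a data volume to double under compound annual growth."""
    if annual_growth_rate <= 0:
        raise ValueError("annual_growth_rate must be positive")
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_volume(initial_tb: float, annual_growth_rate: float, years: int) -> float:
    """Projected volume in TB after a number of years of compound growth."""
    return initial_tb * (1 + annual_growth_rate) ** years


# Growth rates taken from the article; the 100 TB starting point is hypothetical.
for rate in (0.40, 0.50):
    print(f"{rate:.0%} growth: doubles in {doubling_time_years(rate):.2f} years, "
          f"100 TB becomes {projected_volume(100, rate, 5):.0f} TB after 5 years")
```

A calculation like this can help size a cold storage tier several years ahead, rather than for today's volume only.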
Cold storage technology options for mainframe
Mainframe professionals have three general technology options when it comes to cold storage: tape, virtual tape, and cloud. While tapes are still the dominant cold storage media for mainframes, cloud is gaining momentum with its virtually limitless storage and pay-as-you-go model.
Here is a summary of these technologies, and their relative advantages and disadvantages:
Tape
Pros of Tape:
- Often cheaper than other options, depending on the use case
- Full control over where data is stored
- Secure and not susceptible to malware or viruses as it is offline
- Portable and can be carried or sent anywhere
- Easy to add capacity
Cons of Tape:
- Capital investment required for large tape libraries
- Difficult to access (slow and with bottlenecks)
- High recovery time objective (RTO)
- Requires physical access and manual handling (problematic during a lockdown, for example)
- Requires careful maintenance
Virtual Tape Libraries (VTL)
Pros of VTL:
- Scalability – HDDs added to a VTL are perceived as tape storage to the mainframe
- Performance – data access is faster than tape or cloud
- Compatibility – works with tape software features like deduplication
- Familiarity – behaves like traditional tape libraries
Cons of VTL:
- Cost varies; infrastructure, maintenance, and skilled admins should also be considered
- Capital investment required
- Usually less reliable than other options
- Less secure than offline tapes and lacks the latest security features of cloud platforms
Cloud Storage
Pros of Cloud:
- Can be cheaper, especially when hidden costs are understood and managed
- Can improve cash flow thanks to an OpEx financial model rather than CapEx
- Virtually unlimited scalability
- Accessible from anywhere
- Advanced data management
- High data redundancy and easy replication
- Leading-edge security
- Easy to integrate with mainframes
Cons of Cloud:
- Hidden costs (depends on use)
- Data retrieval, backup, and RTO times depend on network bandwidth
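To illustrate how retrieval time depends on network bandwidth, here is a minimal Python sketch. The function name, the decimal-terabyte convention, and the `efficiency` factor (the fraction of the nominal link rate actually achieved) are assumptions for illustration. Real transfers also depend on provider-side retrieval delays, which are not modeled here.

```python
def transfer_time_hours(data_tb: float, bandwidth_gbps: float,
                        efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_tb terabytes over a network link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, contention, and so on).
    """
    if data_tb < 0 or bandwidth_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth, or efficiency")
    bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    usable_bits_per_second = bandwidth_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second / 3600    # seconds to hours


# Hypothetical example: restoring 10 TB over a 1 Gbps link.
print(f"{transfer_time_hours(10, 1):.1f} hours")
```

Comparing an estimate like this against the required RTO shows whether the existing link is adequate or whether more bandwidth is needed.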
Cloud is Rising as a Mainframe Cold Storage Choice
The cloud storage market is expected to reach $88.91 billion by 2022, growing at a CAGR of 23.7%, much higher than the CAGRs of all the other cold storage options combined. Cold storage in the cloud offers a unique combination of scalability, reliability, durability, security, and cost-effectiveness that on-prem options struggle to match.
So, in which cases is cloud preferable to tape and VTL for cold storage?
- When data access frequency changes: The cloud offers several cold storage tiers that trade storage cost against data access frequency. Cold storage tiers can be cost-effective, but if data is accessed frequently, choose a service tier that fits those access needs.
- When the data grows quickly or unpredictably: Cloud platforms can scale almost without limit and with very little effort, unlike on-prem options.
- When improving cash flow is a priority: Predictable OpEx monthly fees can improve cash flow compared to large upfront investment in on-prem storage and infrastructure.
- In case of a mainframe skills shortage: Attracting and retaining mainframe experts is a challenge for many enterprises. With cloud cold storage, this problem largely goes away.
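The trade-off between storage cost and access frequency described in the first point above can be sketched as a simple cost model. The tier names and every price in this Python example are hypothetical placeholders, not any vendor's actual rates. Real pricing typically includes further items such as request fees and minimum retention periods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_tb_month: float  # hypothetical storage price per TB per month
    retrieval_per_tb: float      # hypothetical retrieval price per TB


# Hypothetical tiers for illustration only: colder tiers store cheaply
# but charge more for each TB retrieved.
TIERS = (
    Tier("standard", 20.0, 0.0),
    Tier("cool", 10.0, 10.0),
    Tier("archive", 2.0, 30.0),
)


def monthly_cost(tier: Tier, stored_tb: float, retrieved_tb: float) -> float:
    """Monthly cost of keeping stored_tb and retrieving retrieved_tb."""
    return stored_tb * tier.storage_per_tb_month + retrieved_tb * tier.retrieval_per_tb


def cheapest_tier(stored_tb: float, retrieved_tb: float, tiers=TIERS) -> Tier:
    """Return the tier with the lowest monthly cost for this access pattern."""
    return min(tiers, key=lambda t: monthly_cost(t, stored_tb, retrieved_tb))


# 100 TB stored, with increasing monthly retrieval volumes.
for retrieved in (1, 80, 200):
    print(retrieved, "TB retrieved per month ->", cheapest_tier(100, retrieved).name)
```

With these placeholder prices, the cheapest choice shifts from the coldest tier to warmer ones as retrieval volume rises, which is why access patterns should be reviewed before committing data to a tier.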