One of the great revelations for those considering new or expanded cloud adoption is the cost factor – especially with regard to storage. The received wisdom has long been that nothing beats the low cost of tape for long-term and mass storage.
In fact, though tape is still cheap, cloud options such as Amazon S3 Glacier Deep Archive are getting very close, and they offer tremendous advantages that tape can’t match. A case in point is Amazon S3 Intelligent-Tiering.
Tiering (also called hierarchical storage management or HSM) is not new. It’s been part of the mainframe world for a long time, but with limits imposed by the nature of the storage devices involved and the software. According to Amazon, Intelligent-Tiering helps to reduce storage costs by up to 95 percent and now supports automatic data archiving. It’s a great way to modernize your mainframe environment by simply moving data to the cloud, even if you are not planning to migrate your mainframe to AWS entirely.
How does Intelligent-Tiering work? The idea is pretty simple. When objects are found to have been rarely accessed over long periods of time, they are automatically targeted for movement to less expensive storage tiers.
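The idea can be sketched as a small simulation. This is not AWS code and does not call any AWS API; it is a minimal illustration that assumes the 30-day and 90-day idle thresholds AWS describes for the automatic access tiers, and the names `StoredObject`, `choose_tier`, and `access` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Assumed thresholds (days without access). Check the current AWS
# documentation before relying on these values.
INFREQUENT_AFTER_DAYS = 30
ARCHIVE_INSTANT_AFTER_DAYS = 90


@dataclass
class StoredObject:
    key: str
    last_accessed: date


def choose_tier(obj: StoredObject, today: date) -> str:
    """Return the tier an object would sit in, based only on idle time."""
    idle_days = (today - obj.last_accessed).days
    if idle_days >= ARCHIVE_INSTANT_AFTER_DAYS:
        return "ARCHIVE_INSTANT_ACCESS"
    if idle_days >= INFREQUENT_AFTER_DAYS:
        return "INFREQUENT_ACCESS"
    return "FREQUENT_ACCESS"


def access(obj: StoredObject, today: date) -> None:
    """Reading an object resets its idle clock, so it counts as frequent again."""
    obj.last_accessed = today
```

No rule is written per data set: the only input is when each object was last read, which is the point of automatic tiering.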
Migrate Mainframe to AWS
In the past (both in mainframes and in the cloud) you had to define a specific policy stating what needed to be moved to which tier and when, for example after 30 days or 60 days. The point with the new AWS tiering is that it automatically identifies what needs to be moved and when, and then moves it at the proper time. Migrating mainframe data to Amazon S3 is no problem because modern data movement technology now allows you to move both historical and active data directly from tape or virtual tape to Amazon S3. Once there, auto-tiering can transparently move cold and long-term data to less expensive tiers.
This saves the trouble of needing to specifically define the rules. By abstracting the cost issue, AWS simplifies tiering and optimizes the cost without impacting the applications that read and write the data. Those applications can continue to operate under their usual protocols while AWS takes care of selecting the optimal storage tier. According to AWS, this is the first and, at the moment, the only cloud storage that delivers this capability automatically.
When reading from tape, the traditional lower tier for mainframe environments, recall times are a concern because the system has to deal with tape mount and search protocols. In contrast, Amazon S3 Intelligent-Tiering can provide low-millisecond latency as well as high throughput whether you are calling for data in the Frequent or Infrequent access tiers. In fact, Intelligent-Tiering can also automatically migrate the most infrequently used data to Glacier, the durable and extremely low-cost S3 storage class for data archiving and long-term backup. And with new technology allowing efficient and secure data movement over TCP/IP, getting mainframe data to S3 is even easier.
The potential impact on mainframe data practices
For mainframe-based organizations this high-fidelity tiering option could be an appealing choice compared with tape from both a cost and benefits perspective. However, the tape comparison is rarely that simple. For example, depending on the amount of data involved and the specific backup and/or archiving practices, any given petabyte of data needing to be protected may have to be copied and retained two or more times, which immediately makes tape seem a bit less competitive. Add overhead costs, personnel, etc., and the “traditional” economics may begin to seem even less appealing.
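The effect of multiple retained copies and overhead can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function name `effective_cost_per_tb` is hypothetical, and every price and ratio is a placeholder rather than a quote from any vendor.

```python
def effective_cost_per_tb(unit_cost_per_tb: float,
                          copies: int,
                          overhead_ratio: float = 0.0) -> float:
    """Cost to protect one TB of source data.

    unit_cost_per_tb: media or storage cost for a single copy of one TB
    copies:           how many times the data is copied and retained
    overhead_ratio:   extra cost (personnel, handling, floor space)
                      expressed as a fraction of the media cost
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    return unit_cost_per_tb * copies * (1.0 + overhead_ratio)


# Placeholder figures: tape at 1.0 per TB kept in 3 copies with 50%
# overhead, versus a cloud tier at 2.0 per TB with a single billed copy
# (assuming the provider's redundancy is already included in its price).
tape = effective_cost_per_tb(1.0, copies=3, overhead_ratio=0.5)
cloud = effective_cost_per_tb(2.0, copies=1)
```

With these placeholder inputs the tape figure works out to 4.5 and the cloud figure to 2.0, even though the raw per-TB price of tape is half that of cloud. Real inputs will give different results, which is why the comparison is worth running with your own numbers.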
Tiering, in a mainframe context, is often as much about speed of access as anything else. So, in the tape world, big decisions have to be made constantly about what can be relegated to lower tiers and whether the often much-longer access times will become a problem after that decision has been made. But getting mainframe data to S3, where such concerns are no longer an issue, is now easy. Modern data movement technology means you can move your mainframe data in mainframe format directly to object storage in the cloud so it is available for restore directly from AWS.
Many mainframe organizations have years, even decades, of data on tape. The management of this tape data is retained only in the tape management system. Or perhaps it was just copied forward from a prior tape system upgrade. How much of this data is really needed? Is it even usable anymore? Migrating mainframe data to AWS, specifically this older data, allows it to be managed in a modern way and can reduce the amount of tape data on-premises.
And what about those tapes that today are shipped off-site for storage and recovery purposes? Why not put that data on cloud storage for recovery anywhere?
For mainframe organizations interested in removing on-premises tape technology, reducing tape storage sizes, or creating remote backup copies, cloud options like Amazon S3 Intelligent-Tiering can offer cost optimization that is better “tuned” to an organization’s real needs than anything devised manually or implemented on-premises. Furthermore, with this cloud-based approach, there is no longer any need to know your data patterns or think about tiering; it just gets done.
Best of all, you can now perform a stand-alone restore directly from the cloud. This is especially valuable with ransomware attacks on the rise because there is no dependency on a potentially compromised system.
You can even take advantage of AWS immutable copies and versioning capabilities to further protect your mainframe data.
Of course, in order to take advantage of cloud storage like Amazon S3 Intelligent-Tiering, you need to find a way to get your mainframe data out of its on-premises environment. Traditionally, that has presented a big challenge. But, as with multiplying storage options, the choices in data movement technology are also improving. For a review of new movement options, take a look at a discussion of techniques and technologies for Mainframe to Cloud Migration.
Introducing object storage terminology and concepts – and how to leverage cost-effective cloud data management for mainframe
Object storage is coming to the mainframe. It’s the optimal platform for demanding backup, archive, DR, and big-data analytics operations, allowing mainframe data centers to leverage scalable, cost-effective cloud infrastructures.
For mainframe personnel, object storage is a new language to speak. It’s not complex, just a few new buzzwords to learn. This paper was written to introduce you to object storage, and to assist in learning the relevant terminology. Each term is compared to familiar mainframe concepts. Let’s go!
What is Object Storage?
Object storage is a computer data architecture in which data is stored in object form – as compared to DASD, file/NAS storage and block storage. Object storage is a cost-effective technology that makes data easily accessible for large-scale operations, such as backup, archive, DR, and big-data analytics and BI applications.
IT departments with mainframes can use object storage to modernize their mainframe ecosystems and reduce dependence on expensive, proprietary hardware, such as tape systems and VTLs.
Let’s take a look at some basic object storage terminology (and compare it to mainframe lingo):
- Objects. Object storage contains objects, which are also known as blobs. These are analogous to mainframe data sets.
- Buckets. A bucket is a container that hosts zero or more objects. In the mainframe realm, data sets are hosted on a volume – such as a tape or DASD device.
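The two terms above can be modeled in a few lines. This is a toy in-memory sketch, not a client for any real object store; `StorageObject`, `Bucket`, and `dataset_to_key` are hypothetical names, and turning data set qualifiers into a path-like key is just one possible naming convention.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class StorageObject:
    key: str
    data: bytes
    created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    @property
    def size(self) -> int:
        """Object size in bytes, part of the object's basic metadata."""
        return len(self.data)


@dataclass
class Bucket:
    name: str
    objects: Dict[str, StorageObject] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> StorageObject:
        """Store (or overwrite) an object under the given key."""
        obj = StorageObject(key, data)
        self.objects[key] = obj
        return obj

    def list(self, prefix: str = "") -> List[str]:
        """Return the keys that start with prefix, in sorted order."""
        return sorted(k for k in self.objects if k.startswith(prefix))


def dataset_to_key(dsn: str) -> str:
    """Hypothetical convention: PROD.PAYROLL.Y2024 -> PROD/PAYROLL/Y2024."""
    return dsn.strip().upper().replace(".", "/")
```

A bucket here is a flat key-to-object map. The "/" characters in a key only look like directories; listing by prefix is what gives the effect of browsing by high-level qualifier.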
Data Sets vs. Objects – a Closer Look
As with data sets, objects contain both data and some basic metadata describing the object’s properties, such as creation date and object size. Here is a table with a detailed comparison between data set and object attributes:
The object attributes described below are presented as defined in AWS S3 storage systems.
Volumes vs. Buckets – a Closer Look
Buckets, which are analogous to mainframe volumes, are unlimited in size. Separate buckets are often deployed for security reasons, and not because of performance limitations. A bucket can be assigned a life cycle policy that includes automatic tiering, data protection, replication, and automatic at-rest encryption.
The bucket attributes described below are presented as defined in AWS S3 storage systems.
In the z/OS domain, a SAF user and password are required, as well as the necessary authorization level for the volume and data set. For example, users with ALTER access to a data set can perform any action – read/write/create/delete.
In object storage, users are defined in the storage system. Each user is granted access to specific buckets, prefixes, and objects, and separate permissions are defined for each action, for example:
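As a rough illustration of per-action permissions scoped to a bucket and prefix, consider the sketch below. It does not reproduce any vendor's policy language: `Grant`, `is_allowed`, and the action names `read`, `write`, `delete`, and `list` are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Grant:
    bucket: str
    prefix: str               # grant covers keys starting with this prefix
    actions: FrozenSet[str]   # illustrative action names, e.g. "read"


def is_allowed(grants: Iterable[Grant], bucket: str,
               key: str, action: str) -> bool:
    """True if any grant covers this bucket, key prefix, and action."""
    return any(
        g.bucket == bucket
        and key.startswith(g.prefix)
        and action in g.actions
        for g in grants
    )


# Example: read-only access to production payroll, full access to test data.
user_grants = [
    Grant("mainframe-archive", "PROD/PAYROLL/", frozenset({"read", "list"})),
    Grant("mainframe-archive", "TEST/",
          frozenset({"read", "write", "delete", "list"})),
]
```

Unlike ALTER access on a data set, which implies every action, each action here has to be granted explicitly.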
In addition, each user can be associated with a programmatic API key and API secret in order to access the bucket and the object storage via a TCP/IP-based API. When accessing data in the cloud, HTTPS is used to encrypt the in-transit stream. When accessing data on-premises, HTTP can be used to avoid encryption overhead. If required, the object storage platform can be configured to perform data-at-rest encryption.
Disaster Recovery Considerations
While traditional mainframe storage platforms such as tape and DASD rely on full storage replication, object storage supports both replication and erasure coding. Erasure coding provides significant savings in storage space because redundancy comes from parity fragments rather than full copies, and the fragments can be spread over multiple geographical locations. For example, on AWS, data is automatically spread across a minimum of three geographically separated Availability Zones, thus providing multi-site redundancy and disaster recovery from anywhere in the world. Erasure-coded buckets can also be fully replicated to another region, as is practiced with traditional storage. Most object storage platforms support both synchronous and asynchronous replication.
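The space difference between full replication and erasure coding is simple arithmetic. The sketch below uses hypothetical function names, and the 8+4 layout is only an example scheme, not the layout used by any particular provider.

```python
def replication_raw_tb(logical_tb: float, copies: int) -> float:
    """Raw capacity needed when every byte is stored `copies` times."""
    return logical_tb * copies


def erasure_coded_raw_tb(logical_tb: float,
                         data_fragments: int,
                         parity_fragments: int) -> float:
    """Raw capacity needed when data is split into `data_fragments` pieces
    and `parity_fragments` extra pieces are computed for redundancy.
    The data survives the loss of up to `parity_fragments` fragments.
    """
    total = data_fragments + parity_fragments
    return logical_tb * total / data_fragments


# 100 TB of logical data, three full copies versus an example 8+4 layout.
replicated = replication_raw_tb(100, copies=3)
erasure_coded = erasure_coded_raw_tb(100, data_fragments=8, parity_fragments=4)
```

In this example, three-way replication needs 300 TB of raw capacity, while the 8+4 layout needs 150 TB and still tolerates the loss of any four fragments.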
Model9 – Connecting Object Storage to the Mainframe
Model9’s Cloud Data Manager for Mainframe is a software-only platform that leverages powerful, scalable cloud-based object storage capabilities for data centers that operate mainframes.
The platform runs on the mainframe’s zIIP processors, providing cost-efficient storage, backup, archive, and recovery functionalities with an easy-to-use interface that requires no object-storage knowledge or skills.
Everyone is under pressure to modernize their mainframe environment – keeping all the mission-critical benefits without being so tied to a crushing cost structure and a style of computing that often discourages the agility and creativity enterprises badly need.
Several general traits of cloud can deliver attributes to a mainframe environment that are increasingly demanded and very difficult to achieve in any other way. These are:
Leading cloud providers have data processing assets that dwarf anything available to any other kind of organization. So, as a service, they can provide capacity and/or specific functionality that is effectively unlimited in scale but for which, roughly speaking, customers pay on an as-needed basis. For a mainframe organization this can be extremely helpful for dealing with periodic demand spikes such as the annual holiday sales period. They can also support sudden and substantial shifts in a business model, such as some of those that have emerged during the COVID pandemic.
The same enormous scale of the cloud providers that delivers elasticity also delivers resilience. Enormous compute and storage resources, in multiple locations, and vast data pipes guarantee data survivability. Cloud outages can happen, but the massive redundancy makes data loss or a complete outage highly unlikely.
The ‘pay only for what you need’ approach of cloud means that cloud expenses are generally tracked as operating expenses rather than capital expenditures and, in that sense, are usually much easier to fund. If properly managed, cloud services are usually as cost-effective as on-premises infrastructure and sometimes much more so, though of course complex questions of how costs are logged factor into this. Unlike the mainframe model, there is no single monthly peak 4-hour interval that sets the pricing for the whole month. Also, there is no need to order storage boxes, compute chassis, and other infrastructure components, track shipments, match the bill of materials, or rack and stack the servers, as huge infrastructure is available at the click of a button.
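The difference between peak-based and usage-based charging can be sketched with a simple calculation. This is a simplified illustration, not a reproduction of any vendor's actual pricing method; the function names and the usage figures are hypothetical.

```python
from typing import Sequence


def peak_rolling_average(hourly_usage: Sequence[float],
                         window: int = 4) -> float:
    """Highest average over any `window` consecutive hours."""
    if window < 1 or len(hourly_usage) < window:
        raise ValueError("need at least `window` hourly samples")
    window_sum = sum(hourly_usage[:window])
    best = window_sum
    # Slide the window one hour at a time, updating the running sum.
    for i in range(window, len(hourly_usage)):
        window_sum += hourly_usage[i] - hourly_usage[i - window]
        best = max(best, window_sum)
    return best / window


def total_usage(hourly_usage: Sequence[float]) -> float:
    """Sum of consumption, the basis of pay-for-what-you-use charging."""
    return sum(hourly_usage)


# Two hypothetical days: one mostly idle with a 4-hour spike, one busy all day.
spiky = [10] * 20 + [100] * 4
steady = [100] * 24
```

Both profiles have the same 4-hour peak of 100, so a peak-based charge treats them identically. Their total usage is 600 versus 2400, so a usage-based charge bills the spiky day at a quarter of the steady one.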
Finally, cloud represents a cornucopia of potential solutions to problems you may be facing, with low compute and storage costs, a wide range of infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS) options – including powerful analytic capabilities.
Fortunately, for those interested in exploring cloud options for mainframe environments, there are many paths forward and no need to make “bet the business” investments. On the contrary, cloud options are typically modular and granular, meaning you can choose many routes to the functionality you want while starting small and expanding when it makes sense.
Areas most often targeted for cloud experimentation include:
- Analytics – Mainframe environments have an abundance of data but can’t readily provide many of the most-demanded BI and analytics services. Meanwhile, all across the business, adoption of cloud-based analytics has been growing but without direct access to mainframe data, it has not reached its full potential. Data locked in the mainframe has simply not been accessible. Making mainframe data cloud-accessible is a risk-free first step for modernization that can quickly and easily multiply the options for leveraging key data, delivering rapid and meaningful rewards in the form of scalable state-of-the-art analytics.
- Backup – Mainframe environments know how to do backup but they often face difficult tradeoffs when mainframe resources are needed for so many critical tasks. Backup often gets relegated to narrow windows of time. Factors such as reliance on tape, or even virtual tape, can also make it even more difficult to achieve needed results. In contrast, a cloud-based backup, whether for particular applications or data or even for all applications and data, is one of the easiest use cases to get started with. Cloud-based backups can eliminate slow and bulky tape-type architecture. As a backup medium, cloud is fast and cost-effective, and comparatively easy to implement.
- Disaster recovery (DR) – The tools and techniques for disaster recovery vary depending on the needs of an enterprise and the scale of its budget but often include a secondary site. Of course, setting up a dedicated duplicate mainframe disaster recovery site comes with a high total cost of ownership (TCO). A second, slightly more affordable option is a business continuity colocation facility, which may be shared among multiple companies and made available to one of them at a time of need. Emerging as a viable third option is a cloud-based BCDR capability that provides essentially the same capabilities as a secondary site at a much lower cost. Predefined service level agreements for a cloud “facility” guarantee a quick recovery, saving your company both time and money.
- Archive – Again, existing mainframe operations often rely on tape to store infrequently accessed data, typically outside of the purview of regular backup activities. Sometimes this is just a matter of retaining longitudinal corporate data, but many heavily regulated sectors, such as the financial and healthcare industries, are required to retain data for long durations of 10 years or more. As these collections of static data continue to grow, keeping them in “prime real estate” in the data center becomes less and less appealing. At the same time, few alternatives are appealing because they often involve transporting physical media. The cloud option, of course, is a classic “low-hanging fruit” choice that can eliminate space and equipment requirements on-premises and readily move any amount of data to low-cost and easy-to-access cloud storage.
A Painless Path for Mainframe Administrators
If an administrator of a cloud-based data center was suddenly told they needed to migrate to a mainframe environment, their first reaction would probably be panic! And with good reason. Mainframe is a complex world that requires layers of expertise. On the other hand, if a mainframe administrator chooses to experiment in the cloud or even begin to move data or functions into the cloud, the transition is likely to be smoother. That is not to say that learning isn’t required for the cloud but, in general, cloud practices are oriented toward a more modern, self-service world. Indeed, cloud growth has been driven in part by ease of use.
Odds are good that someone in your organization has already had exposure to cloud; if not, courses and self-study options abound. Above all, cloud is typically oriented toward learn-by-doing, with free or affordable on-ramps that let individuals and organizations gain experience and skills at low cost.
In other words, in short order, a mainframe shop can also develop cloud competency. And, for the 2020s, that’s likely to be a very good investment of time and energy.
The COVID-19 pandemic has presented many challenges for mainframe-based organizations, in particular the need to conduct on-site hardware and software maintenance. That has especially led to consideration of cost-effective cloud data management options as an alternative to legacy mainframe storage platforms.
Evidence is abundant. In recent months, businesses have learned that cloud growth is largely pandemic-proof and may actually have accelerated. According to Synergy Research Group, Q1 spend on cloud infrastructure services reached $29 billion, up 37% from the first quarter of 2019. Furthermore, according to Synergy, anecdotal evidence points to some COVID-19-related market growth as additional enterprise workloads appear to have been pushed onto public clouds. And, according to a recent article in Economic Times, IBM Chief Executive Officer Arvind Krishna has indicated that the pandemic has also heightened interest in hybrid cloud, with transformations that were planned to last for years now being compressed into months.
For organizations built around mainframe technology, these trends underscore an opportunity that has become an urgent need during the pandemic, namely the question of how to reduce or eliminate dependence on on-premises storage assets such as physical tapes – which are still the main reason for needing personnel to access on-prem equipment.
Although essential for many daily operations as well as for routine backup or disaster recovery, these physical assets depend too heavily on having access to a facility and having trained personnel available on site.
Furthermore, they can be expensive to acquire, maintain, and operate. Additionally, on-prem hardware depends on other physical infrastructure that requires on-site maintenance, such as air conditioning and electrical power systems. That was a tolerable situation in the past, but the reality of the pandemic, with lockdowns, transportation problems, and the potential health threats to staff, is leading mainframe operators to consider modern alternatives to many traditional on-prem mainframe storage options.
Cloud data manager for mainframe
One industry-leading example is the Model9 Cloud Data Manager for Mainframe, which securely delivers mainframe data to any cloud or commodity storage platform, eliminating the dependency on physical and virtual tape alike. It leverages reliable and cost-effective cloud storage for mainframe backup, archive, recovery, and space management purposes.
Needless to say, in the event of emergencies ranging from natural disasters to global pandemics, having mainframe data secured in the cloud means organizations can count on continued smooth operations with few if any people required on site. And, on an ongoing basis, Model9 also helps unlock mainframe data by transforming it to universal formats that can be used by advanced cloud-based analytics tools, providing more options for making use of corporate data.
It is a future-proof approach that enhances mainframe operations, adds resiliency, helps control costs, and provides a path to better leverage corporate information in a more flexible and cost-effective manner.
The recently posted Computer Weekly article, “Mainframe storage: Three players in a market that’s here to stay”, did a good job of describing the central players in mainframe disk storage but neglected to mention other types of mainframe storage solutions such as tapes and cloud data management.
In particular, the article omitted mention of one of the biggest opportunities for mainframe storage modernization and cost reduction, namely leveraging the cloud to reduce the footprint and cost of the petabytes of data still locked in various kinds of on-premises tape storage. Model9 currently offers the key to this dilemma by eliminating the dependency on FICON connectivity for mainframe secondary storage. This means, specifically, that mainframe-based organizations can finally gain real access to reliable and cost-effective on-premises and cloud storage from Cohesity, NetApp, Amazon Web Services, Microsoft Azure, Google Cloud Platform, etc. that until now could not be considered due to the proprietary nature of traditional mainframe storage. And, while keeping mainframe as the core system that powers transactions, its data can be accessible for analytics, BI and any other cloud application.
Surely, this is major news for such a key part of the computing market that has hitherto been essentially monopolized by the three players author Antony Adshead discussed at length.
Mainframe professionals know that new technologies can help them achieve even more; they deserve guidance with regard to the wide options opening up for them.
Enterprises are generating huge volumes of data every year with an average annual data growth of 40-50%. This growth has to be handled using IT budgets that are only growing at an annual average of 7%. Such disproportion creates a challenge for mainframe professionals: how can they store all this data cost-effectively?
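The gap between the two growth rates compounds quickly. The short sketch below works through the arithmetic using the rates quoted above; the function names are hypothetical and the five-year horizon is chosen only for illustration.

```python
import math


def compound(value: float, annual_rate: float, years: int) -> float:
    """Value after growing at `annual_rate` (0.40 = 40%) for `years` years."""
    return value * (1.0 + annual_rate) ** years


def doubling_time_years(annual_rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2) / math.log(1.0 + annual_rate)


# Normalize today's data volume and today's budget to 1.0 each.
data_after_5y = compound(1.0, 0.40, 5)     # lower end of the 40-50% range
budget_after_5y = compound(1.0, 0.07, 5)
gap = data_after_5y / budget_after_5y      # data to store per budget unit
```

At 40% annual growth, data volume is roughly 5.4 times larger after five years, while a budget growing at 7% is only about 1.4 times larger, so each budget unit has to cover roughly 3.8 times as much data. At these rates, data doubles about every 1.7 to 2.1 years.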
Particularly challenging is deciding on the right strategy for long-term storage, also known as cold storage, for archived data that is rarely or never accessed. There can be different causes for keeping such data for the long term, which often lasts years or even decades:
- Financial data is stored for compliance and might be required in case of an audit
- Legal information must be kept in case of legal action
- Medical archives are stored in vast quantities and their availability is highly regulated
- Government data has to be stored for legal reasons, sometimes even indefinitely
- Raw data is stored by many enterprises for future data mining and analysis
Desired attributes of a cold storage solution
Cold storage, also referred to as “Tier 3 storage,” has different needs than Tier 0 (high-performance), Tier 1 (primary), and Tier 2 (secondary) storage. These are some of the considerations to keep in mind when designing your cold storage solution:
- Scalability – As the amount of generated data doubles in less than two years on average, your cold storage technology needs to scale accordingly, with no practical upper limit.
- Cost – Cold storage must be as inexpensive as possible, especially because you will need a lot of it. Luckily, because it is rarely accessed, you can compromise on accessibility and performance to reduce cost.
- Durability and Reliability – Durability is how long a storage medium is expected to last; reliability is the ability of that medium not to fail within its durability time frame. Both are important to check, and you will find that some cold storage options are durable but not necessarily as reliable as others, and vice versa.
- Accessibility – Cold storage is meant only for data that does not need to be accessed very often or very rapidly, yet the ability to access it is still important. As mentioned above, compromising on this aspect enables a lower cost.
- Security – The security of cold data is vital. If it is stored onsite you need to take the same security precautions as with your active data. If it is in the cloud, you must ensure the vendor has proper security mechanisms in place.
Cold storage technology options for mainframe
Mainframe professionals have three general technology options when it comes to cold storage: tape, virtual tape, and cloud. While tape is still the dominant cold storage medium for mainframes, cloud is gaining momentum with its virtually limitless storage and pay-as-you-go model.
Here is a summary of these technologies, and their relative advantages and disadvantages:
Tape
Tape drives store data on magnetic tapes and are typically used for offline, archival data. Despite many end-of-life forecasts, the tape market is still growing at a CAGR of 7.6% and is expected to reach $6.5 billion by 2022. Tapes are considered the most reliable low-cost storage medium and if maintained properly can last for years. However, they are also the most difficult to access and it can be quite an ordeal to recover from tapes in case of disaster.
Pros of Tape:
- Often cheaper than other options, depending on the use case
- Full control over where data is stored
- Secure and not susceptible to malware or viruses as it is offline
- Portable and can be carried or sent anywhere
- Easy to add capacity
Cons of Tape:
- Capital investment required for large tape libraries
- Difficult to access (slow and with bottlenecks)
- High recovery time objective (RTO)
- Requires physical access and manual handling (problematic in lockdown, for example)
- Requires careful maintenance
Virtual Tape Libraries (VTL)
A VTL is a storage system made up of hard disk drives (HDDs) that appears to the backup software as a traditional tape library. While not as cheap as tape, HDDs are relatively inexpensive per gigabyte. VTLs are easier to access than tape, and their disks are significantly faster than magnetic tapes (although data is still written sequentially).
Pros of VTL:
- Scalability – HDDs added to a VTL are perceived as tape storage to the mainframe
- Performance – data access is faster than tape or cloud
- Compatibility – works with tape software features like deduplication
- Familiarity – behaves like traditional tape libraries
Cons of VTL:
- Cost varies. Infrastructure, maintenance, and skilled admins should also be considered
- Capital investment required
- Usually less reliable than other options
- Less secure than offline tapes and lacks the latest security features of cloud platforms
Cloud
Cold storage in the cloud is maintained by third-party service providers in a pay-as-you-go model. Rather than selling products, they charge for usage of storage space, bandwidth, data access, and the like. Cloud is becoming extremely popular for cold storage, mainly because it is considerably cheaper than on-prem storage. Pay-as-you-go means that it can start at affordable prices without needing to stock up on tapes and VTLs anymore. There is also no more need to maintain infrastructure or recruit personnel to manage data archives, as these are all handled by the cloud vendor.
The cloud provides superior agility and scalability. Although offline magnetic tapes are more secure, the cloud also provides higher levels of security and compliance than many businesses can achieve on their own. When it comes to durability, the cloud really excels by storing data redundantly across many different storage systems. On the downside, administrators need to consider network bandwidth and the cost of uploads and restores, as using cloud is often more expensive than it appears at first glance.
The leading vendors of long-term cloud storage are Amazon (Glacier and Glacier Deep Archive), Google (Cloud Storage Nearline and Cloud Storage Coldline), Microsoft (Azure Archive Blob Storage), and Oracle (Archive Storage). These vendors charge low rates for storage space but extra fees for bringing data back on-premises, which might prove costly if too much data is retrieved.
Pros of Cloud:
- Can be cheaper, especially when hidden costs are accounted for
- Can improve cash flow thanks to an OpEx financial model rather than CapEx
- Infinitely scalable
- Accessible from anywhere
- Advanced data management
- High data redundancy and easy replication
- Leading-edge security
- Easy to integrate with mainframes
Cons of Cloud:
- Hidden costs (depends on use)
- Data retrieval, backup, and RTO times depend on network bandwidth
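Both cons can be estimated before committing to a design. The sketch below is a back-of-the-envelope calculation with hypothetical function names; the link efficiency and the per-GB fee are placeholder assumptions, not published prices.

```python
def restore_hours(data_tb: float, link_gbps: float,
                  efficiency: float = 0.8) -> float:
    """Hours needed to pull `data_tb` (decimal TB) over a network link.

    efficiency is the assumed fraction of nominal bandwidth that is
    actually usable for the transfer (protocol overhead, sharing, etc.).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed and efficiency must be positive")
    bits = data_tb * 8e12                       # 1 TB = 8e12 bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second / 3600


def retrieval_fee(data_tb: float, fee_per_gb: float) -> float:
    """Retrieval charge for `data_tb`, given a per-GB fee (placeholder)."""
    return data_tb * 1000 * fee_per_gb


# Example: restoring 10 TB over a 1 Gbps link at 80% efficiency.
hours = restore_hours(10, link_gbps=1)
fee = retrieval_fee(10, fee_per_gb=0.02)        # placeholder fee
```

Under these assumptions, the 10 TB restore takes a little under 28 hours, which is the kind of figure to compare against the required RTO.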
Cloud is Rising as a Mainframe Cold Storage Choice
The cloud storage market is expected to reach $88.91 billion by 2022 growing at a CAGR of 23.7%—much higher than the CAGRs of all the other cold storage options combined. Cold storage in the cloud offers a unique combination of scalability, reliability, durability, security, and cost-effectiveness that on-prem options are challenged to meet.
So, in which cases is cloud preferable to tape and VTL for cold storage?
- When data access frequency changes: The cloud offers different cold storage tiers, based on the data access requirements, that balance data storage cost against data access frequency. Cold storage tiers can be cost-effective; however, with high data access frequency, you need to be mindful of choosing a service that addresses those access needs.
- When the data grows quickly or unpredictably: Cloud platforms can scale to infinity with very little effort, unlike on-prem options.
- When improving cash flow is a priority: Predictable OpEx monthly fees can improve cash flow compared to large upfront investment in on-prem storage and infrastructure.
- In case of mainframe skills shortage: Attracting and retaining mainframe experts is a challenge to many enterprises. With cloud cold storage, this problem completely goes away.
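The first point, balancing storage cost against access frequency, can be turned into a small comparison. The sketch below is illustrative only: the `Tier` class, the tier names, and all prices are hypothetical placeholders, and a real comparison would also include request, minimum-duration, and network charges.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_gb_month: float   # placeholder price
    retrieval_per_gb: float       # placeholder price


def monthly_cost(tier: Tier, stored_gb: float,
                 retrieved_gb_per_month: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return (stored_gb * tier.storage_per_gb_month
            + retrieved_gb_per_month * tier.retrieval_per_gb)


def cheapest_tier(tiers: Iterable[Tier], stored_gb: float,
                  retrieved_gb_per_month: float) -> Tier:
    """Pick the tier with the lowest monthly cost for this access pattern."""
    return min(tiers,
               key=lambda t: monthly_cost(t, stored_gb, retrieved_gb_per_month))


# Two made-up tiers: cheap to read versus cheap to keep.
TIERS = [
    Tier("warm", storage_per_gb_month=0.020, retrieval_per_gb=0.00),
    Tier("cold", storage_per_gb_month=0.004, retrieval_per_gb=0.03),
]
```

With these placeholder prices and 1,000 GB stored, the cold tier is cheaper when 100 GB is retrieved per month, but the warm tier wins when 800 GB is retrieved. That is why a change in access frequency should trigger a fresh look at the tier choice.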