One of the great revelations for those considering new or expanded cloud adoption is the cost factor – especially with regard to storage. The received wisdom has long been that nothing beats the low cost of tape for long-term and mass storage.
In fact, though tape is still cheap, cloud options such as Amazon S3 Glacier Deep Archive are getting very close on price and offer tremendous advantages that tape can’t match. A case in point is Amazon S3 Intelligent-Tiering.
Tiering (also called hierarchical storage management or HSM) is not new. It has long been part of the mainframe world, but with limits imposed by the storage devices and software involved. According to Amazon, Intelligent-Tiering helps reduce storage costs by up to 95 percent and now supports automatic data archiving. It’s a great way to modernize your mainframe environment by simply moving data to the cloud, even if you are not planning to migrate your mainframe to AWS entirely.
How does Intelligent-Tiering work? The idea is pretty simple. When objects are found to have been rarely accessed over long periods of time, they are automatically targeted for movement to less expensive storage tiers.
Migrate Mainframe to AWS
In the past (both in mainframes and in the cloud) you had to define a specific policy stating what needed to be moved to which tier and when, for example after 30 days or 60 days. The point of the new AWS tiering is that it automatically identifies what needs to be moved and when, and then moves it at the proper time. Migrating mainframe data to Amazon S3 is no problem, because modern data movement technology now allows you to move both historical and active data directly from tape or virtual tape to Amazon S3. Once there, auto-tiering can transparently move cold and long-term data to less expensive tiers.
This saves the trouble of needing to specifically define the rules. By abstracting the cost issue, AWS simplifies tiering and optimizes the cost without impacting the applications that read and write the data. Those applications can continue to operate under their usual protocols while AWS takes care of selecting the optimal storage tier. According to AWS, this is the first and, at the moment, the only cloud storage that delivers this capability automatically.
When reading from tape, the traditional lower tier for mainframe environments, recall times are a concern because the system has to deal with tape mount and search protocols. In contrast, Amazon S3 Intelligent-Tiering can provide low-millisecond latency as well as high throughput whether you are calling for data in the Frequent or Infrequent access tiers. Intelligent-Tiering can also automatically migrate the most infrequently used data to Glacier, the durable and extremely low-cost S3 storage class for data archiving and long-term backup. And with new technology allowing efficient and secure data movement over TCP/IP, getting mainframe data to S3 is even easier.
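To make the idea of access-based tiering concrete, here is a minimal Python sketch of the decision an automatic tiering service makes for each object. It is a simulation only, not AWS code: the tier names, the day thresholds, and the object keys are assumptions chosen for illustration, and the real tiers and timings should be checked against current AWS documentation.

```python
from datetime import date

# Illustrative thresholds: days without access before an object moves down a
# tier. These values are assumptions for this sketch; check the current AWS
# documentation for the real tiers and timings.
TIER_THRESHOLDS = [
    (0, "Frequent Access"),
    (30, "Infrequent Access"),
    (90, "Archive"),
    (180, "Deep Archive"),
]


def pick_tier(last_access: date, today: date) -> str:
    """Return the coldest tier whose idle threshold the object has reached."""
    idle_days = (today - last_access).days
    tier = TIER_THRESHOLDS[0][1]
    for min_idle_days, name in TIER_THRESHOLDS:
        if idle_days >= min_idle_days:
            tier = name
    return tier


if __name__ == "__main__":
    today = date(2024, 1, 1)
    # Hypothetical object keys derived from mainframe data set names.
    last_access_by_key = {
        "PROD.PAYROLL.CURRENT": date(2023, 12, 28),
        "PROD.PAYROLL.Y2022": date(2023, 10, 15),
        "PROD.AUDIT.Y2015": date(2022, 6, 1),
    }
    for key, last_access in last_access_by_key.items():
        print(f"{key}: {pick_tier(last_access, today)}")
```

The point of the sketch is that nobody writes a per-data-set rule: the only input is when each object was last read, which is exactly the information a manual policy author would otherwise have to guess.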
The potential impact on mainframe data practices
For mainframe-based organizations this high-fidelity tiering option could be an appealing choice compared with tape from both a cost and benefits perspective. However, the tape comparison is rarely that simple. For example, depending on the amount of data involved and the specific backup and/or archiving practices, any given petabyte of data needing to be protected may have to be copied and retained two or more times, which immediately makes tape seem a bit less competitive. Add overhead costs, personnel, etc., and the “traditional” economics may begin to seem even less appealing.
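The effect of multiple copies and overhead on tape economics can be sketched with a few lines of arithmetic. The function and every figure below are hypothetical placeholders, not market prices; the sketch only shows how a low media price is multiplied once retention copies and overhead are counted.

```python
def effective_cost_per_tb(media_price_per_tb: float, copies: int,
                          overhead_fraction: float) -> float:
    """Cost to protect one TB once retained copies and overhead are included."""
    return media_price_per_tb * copies * (1 + overhead_fraction)


if __name__ == "__main__":
    # All figures below are hypothetical placeholders, not market prices.
    tape_media_price = 10.0   # currency units per TB of raw tape
    copies_retained = 3       # for example on-site, off-site, and an older cycle
    overhead = 0.5            # handling, personnel, floor space, as a fraction
    cost = effective_cost_per_tb(tape_media_price, copies_retained, overhead)
    print(f"Effective tape cost per protected TB: {cost:.2f}")
```

A fair comparison with cloud storage should start from this effective figure rather than from the raw media price.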
Tiering, in a mainframe context, is often as much about speed of access as anything else. So, in the tape world, big decisions have to be made constantly about what can be relegated to lower tiers and whether the often much-longer access times will become a problem after that decision has been made. But getting mainframe data to S3, where such concerns are no longer an issue, is now easy. Modern data movement technology means you can move your mainframe data in mainframe format directly to object storage in the cloud so it is available for restore directly from AWS.
Many mainframe organizations have years, even decades of data on tape. The management of this tape data is retained only in the tape management system. Or perhaps it was just copied forward from a prior tape system upgrade. How much of this data is really needed? Is it even usable anymore? Migrating this older data from the mainframe to AWS allows it to be managed in a modern way and can reduce the amount of tape data on-premises.
And what about those tapes that today are shipped off-site for storage and recovery purposes? Why not put that data on cloud storage for recovery anywhere?
For mainframe organizations interested in removing on-premises tape technology, reducing tape storage sizes, or creating remote backup copies, cloud options like Amazon S3 Intelligent-Tiering can offer cost optimization that is better “tuned” to an organization’s real needs than anything devised manually or implemented on-premises. Furthermore, with this cloud-based approach, there is no longer any need to know your data patterns or think about tiering. It just gets done.
Best of all, you can now perform a stand-alone restore directly from the cloud. This is especially valuable with ransomware attacks on the rise because there is no dependency on a potentially compromised system.
You can even take advantage of AWS immutable copies and versioning capabilities to further protect your mainframe data.
Of course, in order to take advantage of cloud storage like Amazon S3 Intelligent Tiering, you need to find a way to get your mainframe data out of its on-premises environment. Traditionally, that has presented a big challenge. But, as with multiplying storage options, the choices in data movement technology are also improving. For a review of new movement options, take a look at a discussion of techniques and technologies for Mainframe to Cloud Migration.
For mainframe shops that need to move data on or off the mainframe, whether to the cloud or to an alternative on-premises destination, FICON, the IBM mainstay for decades, is generally seen as the standard, and with good reason. When it was first introduced in 1998 it was a big step up from its predecessor ESCON that had been around since the early 1990s. Comparing the two was like comparing a firehose to a kitchen faucet.
FICON is fast, in part, because it runs over Fibre Channel in an IBM proprietary form defined by the ANSI FC-SB-3 standard (Single-Byte Command Code Sets-3 Mapping Protocol for Fibre Channel). In that scheme it is a Fibre Channel layer 4 (FC-4) protocol. As a mainframe protocol it is used on IBM Z systems to handle both DASD and tape I/O. It is also supported by other vendors of disk and tape storage and switches designed for the IBM environment.
Over time, IBM has increased speeds and added features such as High Performance FICON without significantly enhancing the disk and tape protocols that run over it, so their limitations on data movement remain. For this reason, FICON’s popularity and long history do not make it the answer for every data movement challenge.
Stuck in the Past
One challenge, of particular concern today, is that mainframe secondary storage is still being written to tape via tape protocols, whether it is real physical tape or virtual tape emulating actual tape. Keeping tape as a central technology means dealing with tape mount protocols and relying on tape management software to track where datasets reside on those miles of Mylar. The serial nature of tape and the limitations of the original hardware often required large datasets to span multiple tape images.
Though virtual tapes written to DASD improved the speed of writes and recalls, they are still constrained by tape’s serialized protocols. This means waiting for tape mounts and waiting for I/O cycles to complete before the next data can be written. When reading back, the system must traverse the tape image to find the specific dataset requested. In short, while traditional tape may have its virtues, speed, meaning the 21st-century speed of modern storage, is not among them. Even though tape and virtual tape are attached via FICON, the process of writing and recalling data relies on the underlying tape protocol, which makes FICON-attached storage less than ideal for many modern use cases.
Faster and Better
But there is an alternative that doesn’t rely on tape or emulate tape because it does not have to.
Instead, software generates multiple streams of data from a source and pushes data over IBM Open Systems Adapter (OSA) cards using TCP/IP in an efficient and secure manner to an object storage device, either on premises or in the cloud. The Open Systems Adapter functions as a network controller that supports many networking transport protocols, making it a powerful helper for this efficient approach to data movement. Importantly, because it builds on open Ethernet standards, OSA is developing faster than FICON. For example, with the IBM z15 there is already a 25GbE OSA-Express7S card, while FICON is still at 16Gb with the FICON Express16 card.
While there is a belief common among many mainframe professionals that OSA cards are “not as good as FICON,” that is simply not true when the necessary steps are taken to optimize OSA throughput.
To achieve better overall performance, the data is captured well before tape handling, thus avoiding the overhead of tape management, tape mounts, etc. Rather than relying on serialized data movement, this approach breaks apart large datasets and sends them across the wire in simultaneous chunks, while also pushing multiple datasets at a time. Data can be compressed prior to leaving the mainframe and beginning its journey, reducing the amount of data that would otherwise be written. Dataset recalls and restores are also compressed and use multiple streams to ensure quick recovery of data from the cloud.
Having the ability to write multiple streams further increases throughput and reduces latency issues. In addition, compression on the mainframe side dramatically reduces the amount of data sent over the wire. If the software is also designed to run on zIIP engines within the mainframe, data discovery and movement, as well as backup and recovery workloads, will consume fewer billable MIPS, and TCP/IP cycles benefit as well.
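The chunk-compress-and-stream idea described above can be sketched in Python. This is a generic illustration of the technique, not any vendor’s implementation: the chunk size, the stream count, the function names, and the in-memory dictionary standing in for an object store are all assumptions, and a real mover would issue network writes where `send_chunk` simply returns the compressed bytes.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size in bytes


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Break one large data set into fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def send_chunk(indexed_chunk: Tuple[int, bytes]) -> Tuple[int, bytes]:
    """Compress a chunk before it leaves the source.

    Returning the compressed bytes stands in for a network write to object
    storage, which a real mover would perform here.
    """
    index, chunk = indexed_chunk
    return index, zlib.compress(chunk)


def move_dataset(data: bytes, streams: int = 4) -> Dict[int, bytes]:
    """Push chunks over several parallel streams instead of one serial pass."""
    chunks = split_into_chunks(data)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return dict(pool.map(send_chunk, enumerate(chunks)))


def restore_dataset(stored: Dict[int, bytes]) -> bytes:
    """Decompress the chunks and reassemble them in their original order."""
    return b"".join(zlib.decompress(stored[i]) for i in sorted(stored))


if __name__ == "__main__":
    dataset = b"CUSTOMER-RECORD " * 20000
    stored = move_dataset(dataset, streams=4)
    sent_bytes = sum(len(chunk) for chunk in stored.values())
    print(f"{len(dataset)} bytes became {sent_bytes} bytes in {len(stored)} chunks")
    assert restore_dataset(stored) == dataset
```

Because every chunk carries its index, chunks can arrive in any order and a restore can fetch them in parallel too, which is what removes the serial bottleneck of tape-style writes.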
This approach delivers mainframe data to cloud storage, including all dataset types and historical data, in a quick and efficient manner. It can also transform mainframe data into standard open formats that BI and analytics tools can ingest off the mainframe, with a key difference: because the transformation occurs on the cloud side, no mainframe MIPS are used to transform the data. This allows for the quick and easy movement of complete datasets, tables, image copies, etc. to the cloud, then makes all data available to open applications by transforming the data on the object store.
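As an illustration of what transforming mainframe data into an open format can involve, the Python sketch below decodes one fixed-width record containing an EBCDIC text field and a packed-decimal (COMP-3) amount, and emits JSON. The record layout, field names, and the choice of code page `cp037` are assumptions made for this example; real layouts come from the copybook that describes the data set.

```python
import json
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()
    value = int("".join(str(digit) for digit in nibbles))
    if sign_nibble == 0x0D:  # 0xD marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)


def record_to_json(record: bytes) -> str:
    """Convert one record of a hypothetical layout into a JSON document.

    Assumed layout: bytes 0-7 hold a name in EBCDIC (code page 037),
    bytes 8-10 hold an amount in packed decimal with two decimal places.
    """
    name = record[:8].decode("cp037").rstrip()
    amount = unpack_comp3(record[8:11], scale=2)
    return json.dumps({"name": name, "amount": str(amount)})


if __name__ == "__main__":
    sample = "SMITH   ".encode("cp037") + bytes([0x12, 0x34, 0x5C])
    print(record_to_json(sample))
```

Running this kind of conversion against objects already in cloud storage is what keeps the transformation work off the mainframe.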
A modern, software-based approach to data movement means there is no longer a need to go to your mainframe team to update the complex ETL process on the mainframe side.
To address the problem of hard-to-move mainframe data, this software-based approach provides the ability to readily move mainframe data and, if desired, readily transform it to common open formats. This data transformation is accomplished on the cloud side, after data movement is complete, which means no mainframe resources are required to transform the data.
- Dedicated software quickly discovers (or rediscovers) all data on the mainframe. Even with no prior documentation or insights, Model9 can rapidly assemble and map the data to be moved, expediting both modernization planning and data movement.
- Policies are defined to move either selected data sets or all data sets automatically, reducing oversight and management requirements dramatically as compared to other data movement methods.
- For the sake of simplicity, a software approach can be designed to invoke actions via a RESTful API or a management UI, as well as from the mainframe side via a traditional batch job or command line.
- A software approach can also work with targets either on premises or in the cloud.
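The policy-driven selection described in the list above can be pictured with a short Python sketch. The policy structure, field names, patterns, and data set names here are all hypothetical and do not represent any product’s actual configuration format or API; the sketch only shows how a handful of name patterns can route many data sets to a target without per-data-set oversight.

```python
from fnmatch import fnmatchcase
from typing import List, Optional

# Hypothetical policies: the first matching pattern decides the target.
POLICIES: List[dict] = [
    {"name": "payroll-archive", "pattern": "PROD.PAYROLL.*", "target": "cloud"},
    {"name": "test-backups", "pattern": "TEST.*", "target": "on-premises"},
]


def select_target(dataset_name: str,
                  policies: List[dict] = POLICIES) -> Optional[str]:
    """Return the target of the first policy whose pattern matches, else None."""
    for policy in policies:
        if fnmatchcase(dataset_name, policy["pattern"]):
            return policy["target"]
    return None


if __name__ == "__main__":
    for name in ("PROD.PAYROLL.Y2019.G0001V00", "TEST.SORT.WORK", "DEV.SCRATCH"):
        print(f"{name} -> {select_target(name)}")
```

The same lookup could sit behind a REST endpoint, a UI, or a batch job, since the decision depends only on the data set name and the policy list.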
In summary, a wide range of useful features can make data movement with a software-based approach intuitive and easy. By avoiding older FICON and tape protocols, a software-based approach can push mainframe data over TCP/IP to object storage in a secure and efficient manner, making it the answer to modern mainframe data movement challenges!
The recently posted Computer Weekly article, “Mainframe storage: Three players in a market that’s here to stay”, did a good job of describing the central players in mainframe disk storage but neglected to mention other types of mainframe storage solutions such as tapes and cloud data management.
In particular, the article omitted mention of one of the biggest opportunities for mainframe storage modernization and cost reduction, namely leveraging the cloud to reduce the footprint and cost of the petabytes of data still locked in various kinds of on-premises tape storage. Model9 currently offers the key to this dilemma by eliminating the dependency on FICON connectivity for mainframe secondary storage. This means, specifically, that mainframe-based organizations can finally gain real access to reliable and cost-effective on-premises and cloud storage from Cohesity, NetApp, Amazon Web Services, Microsoft Azure, Google Cloud Platform, etc. that until now could not be considered due to the proprietary nature of traditional mainframe storage. And, while keeping mainframe as the core system that powers transactions, its data can be accessible for analytics, BI and any other cloud application.
Surely, this is major news for such a key part of the computing market that has hitherto been essentially monopolized by the three players author Antony Adshead discussed at length.
Mainframe professionals know that new technologies can help them achieve even more; they deserve guidance with regard to the wide options opening up for them.
Enterprises are generating huge volumes of data every year with an average annual data growth of 40-50%. This growth has to be handled using IT budgets that are only growing at an annual average of 7%. Such disproportion creates a challenge for mainframe professionals: how can they store all this data cost-effectively?
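The gap between data growth and budget growth compounds quickly, which a few lines of Python make visible. The growth rates are the ones quoted above; the five-year horizon and the function names are assumptions chosen for illustration.

```python
import math


def compound(start: float, annual_rate: float, years: int) -> float:
    """Value after compounding annual_rate for the given number of years."""
    return start * (1 + annual_rate) ** years


def doubling_time_years(annual_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


if __name__ == "__main__":
    years = 5  # hypothetical planning horizon
    budget = compound(1.0, 0.07, years)
    for data_growth in (0.40, 0.50):
        data = compound(1.0, data_growth, years)
        print(f"growth {data_growth:.0%}: data x{data:.2f}, budget x{budget:.2f}, "
              f"doubling every {doubling_time_years(data_growth):.2f} years")
```

At 40% annual growth, five years produce roughly 5.4 times the data against roughly 1.4 times the budget, so the cost per stored terabyte has to fall sharply just to stand still.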
Particularly challenging is deciding on the right strategy for long-term storage, also known as cold storage, for archived data that is rarely or never accessed. There can be different causes for keeping such data for the long term, which often lasts years or even decades:
- Financial data is stored for compliance and might be required in case of an audit
- Legal information must be kept in case of legal action
- Medical archives are stored in vast quantities and their availability is highly regulated
- Government data has to be stored for legal reasons, sometimes even indefinitely
- Raw data is stored by many enterprises for future data mining and analysis
Desired attributes of a cold storage solution
Cold storage, also referred to as “Tier 3 storage,” has different needs than Tier 0 (high-performance), Tier 1 (primary), and Tier 2 (secondary) storage. These are some of the considerations to keep in mind when designing your cold storage solution:
- Scalability – As the amount of generated data doubles in less than two years on average, your cold storage technology needs to scale accordingly, without practical limits.
- Cost – Cold storage must be as inexpensive as possible, especially because you will need a lot of it. Luckily, because it is rarely accessed, you can compromise on accessibility and performance to reduce cost.
- Durability and Reliability – Durability is how long a storage medium can be expected to last; reliability is its ability not to fail within that time frame. Both are important to check, and you will find that some cold storage options are durable but not necessarily as reliable as others, and vice versa.
- Accessibility – Cold storage is meant only for data that does not need to be accessed very often or very rapidly, yet the ability to access it is still important. As mentioned above, compromising on this aspect enables a lower cost.
- Security – The security of cold data is vital. If it is stored onsite you need to take the same security precautions as with your active data. If it is in the cloud, you must ensure the vendor has proper security mechanisms in place.
Cold storage technology options for mainframe
Mainframe professionals have three general technology options when it comes to cold storage: tape, virtual tape, and cloud. While tapes are still the dominant cold storage media for mainframes, cloud is gaining momentum with its virtually limitless storage and pay-as-you-go model.
Here is a summary of these technologies, and their relative advantages and disadvantages:
Tape
Tape drives store data on magnetic tapes and are typically used for offline, archival data. Despite many end-of-life forecasts, the tape market is still growing at a CAGR of 7.6% and is expected to reach $6.5 billion by 2022. Tapes are considered the most reliable low-cost storage medium and, if maintained properly, can last for years. However, they are also the most difficult to access, and it can be quite an ordeal to recover from tapes in case of a disaster.
Pros of Tape:
- Often cheaper than other options, depending on the use case
- Full control over where data is stored
- Secure and not susceptible to malware or viruses as it is offline
- Portable and can be carried or sent anywhere
- Easy to add capacity
Cons of Tape:
- Capital investment required for large tape libraries
- Difficult to access (slow and with bottlenecks)
- High recovery time objective (RTO)
- Requires physical access and manual handling (problematic in lockdown, for example)
- Requires careful maintenance
Virtual Tape Libraries (VTL)
A VTL is a storage system made up of hard disk drives (HDDs) that appears to the backup software as a traditional tape library. While not as cheap as tape, HDDs are relatively inexpensive per gigabyte. They are easier to access than tape and significantly faster than magnetic tapes (although data is still written sequentially).
Pros of VTL:
- Scalability – HDDs added to a VTL are perceived as tape storage to the mainframe
- Performance – data access is faster than tape or cloud
- Compatibility – works with tape software features like deduplication
- Familiarity – behaves like traditional tape libraries
Cons of VTL:
- Cost varies. Infrastructure, maintenance, and skilled admins should also be considered
- Capital investment required
- Usually less reliable than other options
- Less secure than offline tapes and lacks the latest security features of cloud platforms
Cloud
Cold storage in the cloud is maintained by third-party service providers in a pay-as-you-go model. Rather than selling products, they charge for usage of storage space, bandwidth, data access, and the like. Cloud is becoming extremely popular for cold storage, mainly because it is considerably cheaper than on-prem storage. Pay-as-you-go means that you can start at affordable prices without needing to stock up on tapes and VTLs. There is also no need to maintain infrastructure or recruit personnel to manage data archives, as these are all handled by the cloud vendor.
The cloud provides superior agility and scalability. Although offline magnetic tapes are more secure, the cloud provides higher levels of security and compliance than many businesses can achieve on their own. When it comes to durability, the cloud really excels by storing data redundantly across many different storage systems. On the downside, administrators need to consider network bandwidth and the cost of uploads and restores, as using cloud is often more expensive than it appears at first glance.
The leading vendors of long-term cloud storage are Amazon (Glacier and Glacier Deep Archive), Google (Cloud Storage Nearline and Cloud Storage Coldline), Microsoft (Azure Archive Blob Storage), and Oracle (Archive Storage). These vendors charge low rates for storage space but extra fees for bringing data back on-premises, which might prove costly if too much data is retrieved.
Pros of Cloud:
- Can be cheaper, especially when hidden costs are understood and managed
- Can improve cash flow thanks to an OpEx financial model rather than CapEx
- Infinitely scalable
- Accessible from anywhere
- Advanced data management
- High data redundancy and easy replication
- Leading-edge security
- Easy to integrate with mainframes
Cons of Cloud:
- Hidden costs (depends on use)
- Data retrieval, backup, and RTO times depend on network bandwidth
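Both drawbacks listed above lend themselves to a quick estimate. The Python sketch below models a monthly bill as storage plus retrieval and egress charges, and a restore time as data size divided by usable bandwidth. Every price, the 80% link efficiency, and the function names are hypothetical assumptions; substitute your provider’s published rates and your measured throughput.

```python
def monthly_cost(stored_tb: float, retrieved_tb: float,
                 storage_price_per_tb: float,
                 retrieval_price_per_tb: float,
                 egress_price_per_tb: float) -> float:
    """Storage charge plus the per-TB charges for data brought back."""
    storage = stored_tb * storage_price_per_tb
    restore = retrieved_tb * (retrieval_price_per_tb + egress_price_per_tb)
    return storage + restore


def restore_hours(size_tb: float, link_gbps: float,
                  efficiency: float = 0.8) -> float:
    """Hours to pull size_tb (decimal TB) over a link at the given efficiency."""
    bits = size_tb * 8e12
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical prices in currency units per TB; not real vendor rates.
    quiet_month = monthly_cost(100, 0, 1.0, 20.0, 90.0)
    restore_month = monthly_cost(100, 10, 1.0, 20.0, 90.0)
    print(f"No restores: {quiet_month:.2f}, after restoring 10 TB: {restore_month:.2f}")
    print(f"Restoring 10 TB over 1 Gbps takes about {restore_hours(10, 1):.1f} hours")
```

With these placeholder rates a single sizeable restore dominates the month’s bill, which is why retrieval patterns matter as much as the headline storage price.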
Cloud is Rising as a Mainframe Cold Storage Choice
The cloud storage market is expected to reach $88.91 billion by 2022 growing at a CAGR of 23.7%—much higher than the CAGRs of all the other cold storage options combined. Cold storage in the cloud offers a unique combination of scalability, reliability, durability, security, and cost-effectiveness that on-prem options are challenged to meet.
So, in which cases is cloud preferable to tape and VTL for cold storage?
- When data access frequency changes: The cloud offers different cold storage tiers, based on data access requirements, that balance storage cost against access frequency. Cold storage tiers can be cost effective; however, with high data access frequency you need to be mindful of choosing a service that addresses those access needs.
- When the data grows quickly or unpredictably: Cloud platforms can scale to infinity with very little effort, unlike on-prem options.
- When improving cash flow is a priority: Predictable OpEx monthly fees can improve cash flow compared to large upfront investment in on-prem storage and infrastructure.
- When mainframe skills are in short supply: Attracting and retaining mainframe experts is a challenge to many enterprises. With cloud cold storage, this problem completely goes away.
New Jersey Governor Phil Murphy’s open call for COBOL programmers, prompted by system failures in unemployment benefits processing and distribution, completely misses the point.
The fact that the mainframe system could not handle the workload and increase in demand is not the fault of the app and it’s certainly not an issue of the app’s programming language. It is a matter of upgrading the mainframe’s infrastructure to sustain the increased workload.
Governments and institutions have allowed their systems to stagnate, neglecting to invest in agile, newer technologies with greater scalability to keep up with increased workload. In fact, until now the system was working well, with most believing that “if it ain’t broke, don’t fix it.”
The challenges lie with a lack of maintenance and modernization of the mainframe’s infrastructure. If there’s any type of skill shortage, it is that of mainframe system programmers whose mainframe expertise is unique due to the proprietary nature of the system. Any dependency on a unique, proprietary set of skills is a risk for any organization and, therefore, the resolution lies in opening up the system to cloud-native, modern architectures.
To summarize, had organizations invested in integrating cloud technology with their mainframe infrastructure, they would now be benefiting from quick scaling and fast-paced app development on the cloud side to process their mainframe data.
In the current crisis, the cloud serves to decrease the dependency on on-prem hardware and infrastructure, and offloads work from the mainframe infrastructure to increase capacity.