We did the impossible and the unlikely. Venture capital players are all about the newest and most innovative technologies and market opportunities. They rarely consider plays in mature markets – and mainframe technology has long been seen as a mature market; some would have even said a dying market.
We saw things differently and birthed a technology that gives mainframe organizations the power to engage cloud and to build on their institutional strength and proven mainframe technology while engaging with the most advanced cloud capabilities and ultra-agile business practices.
Customers were thirsty for this kind of innovation and we knew we had to move fast if we were going to compete with incumbent giants such as IBM, EMC, Broadcom/CA, and Oracle. The answer had to be venture capital to give us the ability to scale quickly, achieve market penetration, and become the go-to solution for a global base of potential enterprise customers.
Venture capital has been as important as invention and scientific breakthroughs. It gave birth to most of the computer industry (other than IBM) and gave us innumerable consumer brands such as Waze, Mobileye, and Uber, and even the emergence of autonomous cars. VCs are special people – many are startup veterans who know about ramping up, know about market windows, and know that the ability to scale is a matter of life and death in tech markets.
It wasn’t easy for the Model9 founders. We could see eyes roll a bit when we explained that we were tackling the mainframe market. But once we got folks to understand it’s a huge opportunity and that we really have the power to unlock much of the world’s corporate data, the power to put data owners back in control and break a lock on data that sometimes spanned generations, and the power to bring cloud capabilities to the needs of the enterprise – they started to listen and listen closely.
And, like others who have won the nod from VCs, we demonstrated that we had the team, the experts, and the drive to take a modest startup and its innovative technology and use it to transform the huge enterprise computing market.
We also had to clarify how we differed from a wide spectrum of VC-backed storage and data management companies that are transforming industries (WekaIO, VAST, Pure, Cohesity…). None of them touch the mainframe, despite its great importance and central role in the enterprise market and in the world economy in general. Could it be, they wondered, that we were the first to crack open a market long monopolized by a few big players? Were we really doing something that unique?
Yes, we were – and they saw the light.
The lesson in all this may be that opportunities can be right in front of you and still be overlooked. The mainframe world can seem arcane, closed, and unfamiliar to most who have grown up in the PC/Server/Web/device era. But it remains an arena of action, absorbing billions of dollars in spending each year and contributing directly to trillions of dollars in economic activity.
And, surprisingly, it is a green field for innovation and VC investing. It has been a long time since this corner of IT was viewed seriously as a field for VC investment, but it is finally being viewed not as an IT “island” but, much more accurately, as a “supercontinent,” waiting to be fully integrated and optimized with the rest of our dynamic, information-rich, and connected society.
For VCs this should spell opportunity, but many of them, and nearly all of the players in the startup world, have grown up and profited entirely within the non-mainframe world. We got lots of those blank stares when we started out, until our persistence led us to Intel Capital, StageOne Ventures, North First Ventures and GlenRock, who grasped the importance of the mainframe market and saw our potential, and we’re grateful that they did.
We hope more venture firms will follow them to this land of opportunity.
For many mainframers, writing to object storage from zSeries mainframes over TCP/IP is a new concept.
The ease of use and the added value of implementing this solution are clear, but there is another question: What should be used as a target repository? How do customers decide on a vendor for object storage, and whether to use a private cloud, hybrid cloud, or public cloud? Target repositories can be either an on-prem object storage system, such as Hitachi HCP or Cohesity, or a public cloud, such as AWS, Azure, or GCP.
The best option for you depends on your individual needs. There are pros and cons in each case. In this post, we break down the factors you need to consider so you know how to choose a target cloud repository that will meet your needs.
- Network bandwidth and external connections
- Amount of data recalled, restored or read back from repository
- DR testing and recovery plans
- Corporate strategies, such as “Mainframe Modernization” or “Cloud First”
- Cloud acceptance or preferred cloud vendor already defined
- Cyber resiliency requirements
- Floor or rack space availability
Network bandwidth and external connections
Consider the bandwidth of the OSA cards and the external bandwidth to a remote cloud, if cloud is an option. Is the external connection shared with other platforms? Is a cloud connection already established for the corporation?
For on-premise storage, network connectivity is still required, but it is an internal network with no external access.
Amount of data recalled, restored or read back from repository
There are added costs for reading data back from the public cloud, so an understanding of expected read throughput is important when comparing costs. If the read rate is high, then consider an on-premise solution.
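As a rough illustration of this comparison, the sketch below adds a read-back charge to a storage charge and finds the monthly read volume at which a cloud repository would cost as much as an on-premise one. The function names and every price are hypothetical placeholders, not any vendor's actual rates, and the model ignores add-on feature charges.

```python
def monthly_cloud_cost(stored_gb, read_gb, storage_price, read_price):
    """Monthly cloud bill: storage charge plus read-back charge.

    storage_price and read_price are hypothetical per-GB prices.
    """
    return stored_gb * storage_price + read_gb * read_price


def break_even_read_gb(stored_gb, storage_price, read_price, onprem_monthly):
    """Monthly read volume (GB) at which cloud cost equals on-premise cost.

    onprem_monthly is an assumed flat monthly cost for the on-premise
    option (hardware, space, power, maintenance). Returns 0.0 when cloud
    storage alone already costs more than on-premise.
    """
    headroom = onprem_monthly - stored_gb * storage_price
    return max(headroom, 0.0) / read_price


if __name__ == "__main__":
    # Placeholder figures only; substitute numbers from your own quotes.
    limit = break_even_read_gb(
        stored_gb=1000, storage_price=0.5, read_price=0.25, onprem_monthly=1000
    )
    print(f"Cloud stays cheaper while monthly reads are below {limit:.0f} GB")
```

If the expected monthly read volume sits well above the break-even figure, the on-premise option is likely the cheaper one under these assumptions.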
DR testing and recovery plans
Cloud-based recovery allows recovery from anywhere. And public clouds can replicate data across multiple sites automatically. The disaster recovery or recovery site must have network connectivity to the cloud.
An on-premise solution requires a defined disaster recovery setup: a second copy of the object storage, kept off-site and replicated from the primary site. Recovery at the DR site then accesses this replicated object storage.
Corporate strategies, such as “Mainframe Modernization” or “Cloud First”
You should be able to quickly move mainframe data to cloud platforms by modernizing backup and archive functions. Cloud also offers policy-driven and/or automatic tiering of data to lower the cost of cold storage.
If there is no cloud initiative, the on-premise solution may be preferred. Many object storage providers have options to push data from on-premise storage to a public cloud, so hot data can stay close while cold data is placed in the cloud.
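A policy-driven tiering rule of this kind can be pictured with a small sketch. The tier names, the age thresholds, and the data set names below are all hypothetical and chosen only for illustration; real object storage products define their own tiers and policy syntax.

```python
def choose_tier(days_since_last_access, hot_days=30, warm_days=365):
    """Return a hypothetical tier name based on how recently data was used."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access <= hot_days:
        return "on-prem"         # hot: keep close to the mainframe
    if days_since_last_access <= warm_days:
        return "cloud-standard"  # cooling: pushed out to a public cloud
    return "cloud-cold"          # cold: lowest-cost cloud tier


def plan_placement(datasets, **thresholds):
    """Map each data set name to a tier; `datasets` is {name: days idle}."""
    return {
        name: choose_tier(days, **thresholds)
        for name, days in datasets.items()
    }


if __name__ == "__main__":
    idle_days = {"PAYROLL.DAILY": 2, "SALES.Q1": 120, "AUDIT.Y2015": 2000}
    for name, tier in plan_placement(idle_days).items():
        print(f"{name:15s} -> {tier}")
```

The point of the sketch is only that placement follows a rule applied to every data set, rather than a per-file decision made by an operator.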
Cloud acceptance or preferred cloud vendor already defined
Many corporations already have a defined cloud strategy and a cloud vendor of choice. You’ll want a vendor-agnostic solution.
The knowledge of defining the repository and maintaining it could be delegated to other groups within the organization familiar with and responsible for the corporate cloud.
Cyber resiliency requirements
On-premise solutions can generate immutable snapshots to protect against cyber threats. An air-gapped solution can be architected to place copies of data on a separate environment that can be detached from networks.
Cloud options also include things like versioning, multiple copies of data, and multi-factor authentication to protect data and allow recovery.
Floor or rack space availability
With an on-premise solution, floor space, rack space, power, etc. are required. With a cloud solution, no on-premise hardware is required.
There is no clear-cut performance benefit for either solution. It depends upon the hardware and network resources and the amount of data to be moved and contention from other activity in the shop using the same resources.
Cloud customers with performance concerns may choose to establish a direct connection to cloud providers in local regions to prevent latency issues. These concerns are less relevant when a corporate cloud strategy is already in place.
Cloud storage is priced by repository size and type. There are many add-on costs for features and costs for reading back. There are mechanisms to reduce costs, such as tiering data. Understanding these costs upfront is important.
On-premise object storage requires at least two systems for redundancy, plus installation and maintenance.
Mainframe modernization is a top priority for forward-thinking I&O leaders who don’t want to be left behind by rapid technological change. But moving applications to the cloud has proven to be difficult and risky, which can lead to analysis-paralysis or initiatives that run over budget and fail. This is slowing the pace of adaptation, starving cloud functions of access to mainframe data, and often inhibiting any positive action at all.
So, at the moment, most mainframe data is effectively siloed, cut off from BI or AI cloud applications and data lakes in the cloud and simply locked in a “keep the lights on” mentality that is dangerous if continued too long.
Part of the problem is that mainframe organizations have focused on an application-first approach to cloud engagement which is usually the wrong approach because cost and complexity get in the way. Leaders should instead take a data-first approach that allows them to begin modernization by liberating mainframe data and moving storage and even their backup and recovery functions to the cloud. This has the benefit of making mainframe data immediately accessible in the cloud without requiring any mainframe application changes.
Why Is Mainframe Modernization So Hard?
The mainframe environment has been aptly described as a walled garden. Everything inside that garden works well, but the wall makes this Eden increasingly irrelevant to the rest of the world. And the isolation keeps the garden from reaching its full potential.
The walled garden is a result of the inherent nature of mainframe technology, which has evolved apart from the cloud and other on-prem environments. This means the architecture is fundamentally different, making a so-called lift-and-shift move to the cloud very difficult. Applications built for mainframe must stay on the mainframe and adapting them to other environments is often prohibitive. At an even more fundamental level, mainframe data is stored in forms that are incomprehensible to other environments.
How does Model9 Cloud Data Manager for Mainframe Work?
While mainframe lift-and-shift strategies may be very challenging, the movement of data to the cloud has suddenly gotten much easier thanks to Model9 Cloud Data Manager for Mainframe, which represents a fresh technology direction.
Our patented and proven technology takes mainframe data and moves it quickly and easily to the cloud and can then transform it in the cloud to almost any industry-standard form.
With Model9, mainframe data is first moved to an object storage target in a public cloud or a private cloud. The process is vendor-agnostic and eliminates most of the traditional costs associated with mainframe ETL because it leverages zIIP engines to handle movement (over TCP/IP) and accomplishes the “transform” step in the cloud, without incurring MSU costs.
This can work with any mainframe data but is especially helpful for historic data and any data resident on tape or virtual tape, normally hard to access even for the mainframe itself.
The result is backup, archiving, and recovery options in the cloud that are cost-effective, faster, and easier to access than in traditional on-prem systems. And Model9 has almost no impact on existing mainframe applications and operations. It is a data-first approach that allows you to transition mainframe data into the cloud with a software-only solution.
The Benefits Of Model9’s Data-first Approach
By focusing on the simpler task of moving mainframe data first, organizations gain multiple advantages including:
- Cost Reduction by reducing or eliminating the tape or VTL hardware footprint and associated mainframe software (DFSMShsm etc.), as well as reducing MSU charges
- Cloud can deliver a full data protection solution that provides security and “recover anywhere” capability.
- Cloud-based transformation immediately unlocks mainframe data for use in cloud applications.
- Cloud can also yield performance improvements such as reduced backup windows, reduced peak processing demand, and reduced system overhead.
Data First Works
Data-first mainframe modernization empowers leaders to broaden their cloud adoption strategy and secure more immediate benefits. It can accelerate cloud migration projects by leveraging non-mainframe skills and delivering simplified data movement. Organizations can readily maintain and access mainframe data after migrating to the cloud to meet corporate and regulatory requirements without the traditional high costs.
In addition, a data-first approach reduces the burden on your mainframe by offloading data to external platforms while empowering analytics and new applications in the cloud for some of your most valuable data.
According to Gartner, with new approaches to data and cloud, ‘Through 2024, 40% of enterprises will have reduced their storage and data protection costs by over 70% by implementing new storage technologies and deployment methods.’
Best of all, a data-driven approach allows organizations to combine the power of the mainframe with the scalability of the cloud and modernize on their own terms.
If you are still using a legacy VTL/Tape solution, you could be enjoying better performance by sending backup and archive copies of mainframe data directly to cloud object storage.
The reason for this is when you replace legacy technology with modern object storage, you can eliminate bottlenecks that throttle your performance. In other words, you can build a connection between your mainframe and your backup/archive target that can move data faster. You can think of this as “ingestion throughput.”
3 ways you can increase ingestion throughput for backup and archive copies of mainframe data
Here are the top three ways you can increase ingestion throughput:
#1: Write data in parallel, not serially
The legacy mainframe tapes used to make backup and archive copies required data to be written serially. This is because physical tape lived on reels, and you could only write to one place on the tape at a time. When VTL solutions virtualized tape, they carried over this sequential access limitation.
In contrast, object storage does not have this limitation and does not require data to be written serially. Instead, it is possible to use a new method to send multiple chunks of data simultaneously directly to object storage using TCP/IP.
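To make the contrast with serial tape writes concrete, here is a minimal sketch of sending chunks in parallel. It is not the method any particular product uses: `InMemoryObjectStore` is a hypothetical stand-in for an object storage target, and the key naming scheme and chunk size are assumptions made for the example. Real object stores expose their own upload interfaces.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage target (not a real API)."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def put(self, key, data):
        with self._lock:
            self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def keys(self, prefix):
        return sorted(k for k in self._objects if k.startswith(prefix))


def split_into_chunks(data, chunk_size):
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_upload(store, name, data, chunk_size=4, workers=4):
    """Send all chunks concurrently; return the number of parts written."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(store.put, f"{name}/part-{i:05d}", chunk)
            for i, chunk in enumerate(chunks)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker thread
    return len(chunks)


def reassemble(store, name):
    """Read the parts back in key order and join them."""
    return b"".join(store.get(k) for k in store.keys(f"{name}/part-"))


if __name__ == "__main__":
    store = InMemoryObjectStore()
    payload = b"ABCDEFGHIJ"
    parts = parallel_upload(store, "backup", payload)
    print(parts, reassemble(store, "backup") == payload)
```

Because each part carries its own zero-padded index in the key, the parts can arrive in any order and still be reassembled correctly, which is exactly what a sequential medium cannot offer.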
#2: Use zIIP engines instead of mainframe MIPS
Legacy mainframe backup and archive solutions use MSUs, taking away from the processing available to other tasks on the mainframe. This in effect means that your mainframe backups are tying up valuable mainframe computing power, reducing the overall performance you can achieve across all the tasks you perform there.
You do not need to use MSUs to perform backup and archive tasks. Instead, you can use the mainframe zIIP engines—reducing the CPU overhead and freeing up MSUs to be used for other things.
#3: Compress data before sending it
Legacy mainframe backup and archive solutions do not support compressing data before sending it to Tape/VTL. This means that the amount of data that needs to be sent is much larger than it could be using modern compression techniques.
Instead, you can compress your data before sending it to object storage. Not only do you benefit from smaller data transfer sizes, but you also increase the effective capacity of your existing connection between the mainframe and the storage target. For example, compressing data at a 3:1 ratio would effectively turn a 1GB line into a 3GB line, allowing you to send the same amount of data faster while still using your existing infrastructure.
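The arithmetic behind the 3:1 example can be written out as a short sketch. The function names are illustrative, `zlib` is used here only as a convenient stand-in for whatever compression a given solution applies, and the ratio actually achieved depends heavily on the data being compressed.

```python
import zlib


def measured_ratio(data: bytes, level: int = 6) -> float:
    """Compression ratio (original size / compressed size) using zlib."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


def effective_rate(line_rate: float, ratio: float) -> float:
    """Uncompressed data moved per second over a line of the given rate."""
    return line_rate * ratio


def transfer_seconds(data_size: float, line_rate: float, ratio: float = 1.0) -> float:
    """Time to move data_size units of uncompressed data.

    data_size and line_rate must use the same unit (for example GB and GB/s).
    """
    return data_size / effective_rate(line_rate, ratio)


if __name__ == "__main__":
    # The 3:1 case from the text: a line rated at 1 unit/s behaves like 3 units/s.
    print(effective_rate(1.0, 3.0))
    print(transfer_seconds(300.0, 1.0), transfer_seconds(300.0, 1.0, 3.0))

    # Ratio measured on a deliberately repetitive sample; real data will differ.
    sample = b"CUSTOMER-RECORD-0001;" * 2000
    print(round(measured_ratio(sample), 1))
```

The sketch leaves out the CPU time spent compressing, so it shows only the upper bound on the gain from the network side.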
Faster than VTL: Increase Mainframe Data Management Performance
Replacing your legacy VTL/Tape solution with a modern solution that can compress and move data to cloud-based object storage can significantly decrease the amount of time it takes to back up and archive your mainframe data, without increasing resource consumption.
Writing in parallel, leveraging zIIP engines, and employing compression is a low-risk, high-reward option that leverages well-known, well-understood, and well-proven technologies to address a chronic mainframe challenge. This can yield immediate, concrete benefits, such as reducing the time it takes to back up and archive your mainframe data and cutting costs while boosting capabilities.
Cloud-first strategies propose an optimistic ideal in which on-prem IT is a thing of the past and any imaginable function can be bought or created in the cloud. To be sure, this vision is based on a reality: There are many successful organizations that have been “born in the cloud” and many others that have successfully moved most or all functionality there, as well.
But I&O leaders of mainframe-based organizations, though often subjected to relentless questioning regarding potential financial benefits from moving to the cloud, know that the nature of mainframe and the “gravity” of the data and applications on premises, make a move to the cloud challenging, at best. For them, ‘cloud first’ can seem to be nothing but a chimera.
However, it doesn’t have to be that way. Modern tools put the cloud within reach as never before; initially as an adjunct to mainframe and, over the long term, perhaps even as a replacement if such a move actually makes business sense (and, it often does not!).
Cloud is best considered as part of a mainframe modernization effort. The unique characteristics of mainframe and mainframe applications mean that migrating them to the cloud is difficult, requiring refactoring and/or rearchitecting, which is time-consuming, expensive, and risky. So, approaches that strengthen the mainframe while engaging with cloud make sense.
Blocked by Siloed Mainframe Data
Reluctance to attempt actual application migration to the cloud is an acknowledgement that the default approach trends not towards cloud first, but instead towards a ‘Mainframe + Cloud’ strategy. But the result is mainframe data silos that limit business options and reduce the utility of the data.
Siloing your mainframe data has an immediate business impact. Your most valued data is excluded from some of the most important analytical tools available, in particular cloud-based data lakes that have become a key tool for enterprises striving for agility and insight.
That absence of data also significantly limits the potential ROI of any cloud adoption and integration strategy because cloud capabilities will be missing a critical portion of the universe of relevant data. And, ultimately, it leaves your company in a straitjacket, restricting the potential dynamism of your IT organization.
Data-led Migrations are the Key
Fortunately, this problem has a solution. Rather than taking the old-school approach of attempting to migrate mainframe applications all at once or keeping your data siloed, there is a modern alternative. It is based on the prescient idea that data itself is the answer. Data gravity is the colloquial term for the insight that data has power wherever it is located and can be accessed. That’s true when the data is locked exclusively in the mainframe environment and it is also true if it can be moved to the cloud. Move the data, according to this insight, and functionality will naturally tend to follow.
Put another way, moving the data is what matters. Once the data is available in a different environment the organization will evolve ways to access and use that data – either by migrating applications or by choosing to adopt a cloud capability that can deliver the same results with less cost and trouble.
Model9 delivers the capability to move your data and empowers you to choose when and, equally important, how much. For example, you can start with archival data that is used infrequently in the mainframe but has potentially limitless value in an analytics context. By moving that data to the cloud, you can free storage capacity on-prem (potentially allowing you to eliminate tape and VTLs). The mainframe can still access the data when needed but analytic tools in the cloud may use it much more often.
With data gravity increasingly centered in the cloud, you are in charge. You can continue to support mainframe while gradually building new applications and functionality in the cloud. Or, the data is there if you eventually decide on a full lift and shift.
No matter the scale of movement required, Model9 can support it. Data can be moved without first having to select only files deemed relevant. All the data can be moved. And the further slicing and transformation into the desired format can be accomplished in the cloud. You can enjoy all the benefits of mainframe data in the cloud, while still retaining the ability to refactor your mainframe applications only when you are ready, if at all.
Model9 puts you in charge of your data and lets you put data gravity to work for your goals, allowing you to reshape your IT operations the way you want.
Mainframe modernization is a broad topic and one that elicits symptoms of anxiety in many IT professionals. Whether the goals are relatively modest, like simply updating part of the technology stack or offloading a minor function to the cloud, or an ambitious goal like a change of platform with some or all functions heading to the cloud, surveys show it is a risky business…
For example, according to the 2020 Mainframe Modernization Business Barometer Report, published by OneAdvanced.com, a UK software company, some 74 percent of surveyed organizations have started a modernization program but failed to complete it. This is in accord with similar studies highlighting the risks associated with ambitious change programs.
Perhaps that’s why mainframe-to-cloud migration is viewed with such caution. And, indeed, there are at least five reasons to be wary (but in each case, the right strategy can help!)
Top 5 reasons why mainframe to cloud migration initiatives fail
A focus on lift and shift of business logic
Lift and shift is easier said than done when it comes to mainframe workloads. Mainframe organizations that have good documentation and models can get some clarity regarding business logic and the actual supporting compute infrastructure. However, in practice, such information is usually inadequate. Even when the documentation and models are top notch, they can miss crucial dependencies or unrecognized processes. As a consequence, efforts to recreate capabilities in the cloud can yield some very unpleasant surprises when the switch is flipped. That’s why many organizations take a phased and planful approach, testing the waters one function at a time and building confidence in the process and certainty in the result. Indeed, some argue that the lift and shift approach is actually obsolete. One of the enablers for the more gradual approach is the ability to get mainframe data to the cloud when needed. This is a requirement for any ultimate switchover but if it can be made easy and routine it also allows for parallel operations, where cloud function can be set up and tested with real data, at scale, to make sure nothing is left to chance and that a function equal to or better than on-premises has been achieved.
Ignoring the need for hybrid cloud infrastructure
Organizations can be forgiven for wanting to believe they can achieve a 100 percent cloud-based enterprise. Certainly, there are some valid examples of organizations that have managed this task. However, for a variety of good, practical reasons, analysts question whether completely eliminating on-premises computing is either achievable or wise. A “Smarter with Gartner” article, Top 10 Cloud Myths, noted “The cloud may not benefit all workloads equally. Don’t be afraid to propose non cloud solutions when appropriate.” Sometimes there’s a resiliency argument in favor of retaining on-prem capabilities. Or, of course, there may be data residency or other requirements tilting the balance. The point is that mainframe cloud migration that isn’t conceived in hybrid terms is nothing less than a rash burning of one’s bridges. And a hybrid future, particularly when enabled by smooth and reliable data movement from mainframe to cloud, can deliver the best of both worlds in terms of performance and cost-effective spending.
Addressing technology infrastructure without accounting for a holistic MDM strategy
Defined by IBM as “a comprehensive process to drive better business insights by providing a single, trusted, 360-degree view into customer and product data across the enterprise,” master data management (MDM) is an important perspective to consider in any migration plan. After taking initial steps to move data or functions to the cloud, it quickly becomes apparent that having a comprehensive grasp of data, no matter where it is located, is vital. Indeed, a recent TDWI webinar dealt with exactly this topic, suggesting that multi-domain MDM can help “deliver information-rich, digitally transformed applications and cloud-based services.” So, without adaptable, cloud-savvy MDM, migrations can run into problems.
Assuming tape is the only way to back up mainframe data
Migration efforts that neglect to account for the mountains of data in legacy tape and VTL storage can be blindsided by how time consuming and difficult it can be to extract that data from the mainframe environment. This can throw a migration project off schedule or lead to business problems if backup patterns are interrupted or key data suddenly becomes less accessible. However, new technology makes extraction and movement much more feasible and the benefits of cloud data storage over tape in terms of automation, access, and simplicity are impressive.
Overlooking the value of historical data accumulated over decades
A cloud migration is, naturally, a very future-focused activity in which old infrastructure and old modes of working are put aside. In the process, organizations are sometimes tempted to leave some of their data archives out of the picture, either by shredding tapes no longer retained under a regulatory mandate or by simply warehousing them. This is particularly true for older and generally less accessible elements. But for enterprises fighting to secure their future in a highly competitive world, gems of knowledge are waiting regarding every aspect of the business – from the performance and function of business units, the shop floor and workforce demographics to insights into market sectors and even consumer behavior. With cloud storage options, there are better fates for old data than gathering dust or a date with the shredder. Smart organizations recognize this fact and make a data migration strategy the foundation for their infrastructure modernization efforts. The data hiding in the mainframe world is truly an untapped resource that can now be exploited by cloud-based services.
Failure is not an option
Reviewing these five potential paths to failure in mainframe-cloud migration should not be misconstrued as an argument against cloud. Rather, it is intended to show the pitfalls to avoid. When considered carefully and planfully – and approached with the right tools and the right expectations – most organizations can find an appropriate path to the cloud.
Vendors are scrambling to deliver modern analytics to act on streams of real-time mainframe data. There are good reasons for attempting this activity, but they may actually be missing the point or at least missing a more tantalizing opportunity.
Real-time data in mainframes comes mostly from transaction processing. No doubt, spotting a sudden national spike in cash withdrawals from a bank’s ATM systems or an uptick in toilet paper sales in the retail world may have significance beyond the immediate “signal” to reinforce cash availability and reorder from a paper goods supplier. These are the kinds of things real-time apostles rave about when they tout the potential for running mainframe data streams through Hadoop engines and similar big data systems.
What’s missed, however, is the fact that mainframe systems have been quietly accumulating data points just like this for decades. And where mainframe data can be most valuable is in supporting analytics across the time axis. Looking at when similar demand spikes have happened over time and their duration and repetition offers the potential to predict them in the future and can hint at the optimal ways to respond and their broader meaning.
Furthermore, for most enterprises, a vast amount of real-time data exists outside the direct purview of mainframe: think about the oceans of IoT information coming from machinery and equipment, real-time sensor data in retail, and consumer data floating around in the device universe. Little of this usually gets to the mainframe. But it is this data, combined with mainframe data that is not real-time (but sometimes near-real-time), that may have the greatest potential as a font of analytic insight, according to a recent report.
To give mainframes the power to participate in this analytics bonanza requires some of the same nostrums being promoted by the “real-time” enthusiasts but above all requires greatly improving access to older mainframe data, typically resident on tape or VTL.
The optimal pattern here should be rescuing archival and non-real-time operational data from mainframe storage and sharing it with on-prem or cloud-based big data analytics in a data lake. This allows the mainframe to continue doing what it does best while providing a tabula rasa for analyzing the widest range and largest volume of data.
Technology today can leverage the too-often unused power of zIIP engines to facilitate data movement inside the mainframe and help it get to new platforms for analytics (ensuring necessary transformation to standard formats along the way).
It’s a way to make the best use of data and the best use of mainframe in its traditional role while ensuring the very best in state-of-the-art analytics. This is a far more profound opportunity than simply dipping into the flow of real-time data in the mainframe. It is based on a fuller appreciation of what data matters and how data can be used. And it is the path that mainframe modernizers will ultimately choose to follow.
Blame the genius that gave us the term “cloud” as shorthand for distributed computing. Clouds, in many languages and cultures, are equated with things ephemeral and states of mind that are dreamy or thoughts that are vague.
Well, cloud computing is none of those “cloud things.” It is the place where huge capital investments, the best thinking about reliability, and the leading developments in technology have come together to create a value proposition that is hard to ignore.
When it comes to reliability, as a distributed system – really a system of distributed systems – cloud accepts the inevitability of failure in individual system elements and in recompense, incorporates very high levels of resilience across the whole architecture.
For those counting nines (those reliability figures quoted as 99.xxx) there can be enormous comfort in the figures quoted by cloud providers. Those digging deeper may find the picture to be less perfect in ways that make the trusty mainframe seem pretty wonderful. But the vast failover capabilities built into clouds, especially those operated by the so-called hyperscalers, are so immense as to be essentially unmatchable, especially when other factors are considered.
The relevance of this for mainframe operators is not about “pro or con.” Although some enterprises have taken the “all cloud” path, in general, few are suggesting the complete replacement of mainframe by cloud.
What is true instead is that the cloud’s immense reliability, its ability to offer nearly turnkey capabilities in analytics and many other areas, and its essentially unlimited scalability make it the only really meaningful way to supplement mainframe core capabilities. In 2021 its growth is unabated.
Whether it is providing the ultimate RAID-like storage reliability across widely distributed physical systems to protect and preserve vital data or spinning up compute power to ponder big business (or tech) questions, cloud is simply unbeatable.
So, for mainframe operations, it is futile to try to “beat” cloud but highly fruitful to join – the mainframe + cloud combination is a winner.
Indeed, Gartner analyst Jeff Vogel, in a September 2020 report, “Cloud Storage Management Is Transforming Mainframe Data,” predicts that one-third of mainframe data (typically backup and archive) will reside in the cloud by 2025, most likely a public cloud, compared to less than 5% at present – a stunning shift.
This change is coming. And it is great news for mainframe operators because it adds new capabilities and powers to what’s already amazing about mainframe. And it opens the doors to new options that have immense potential benefits for enterprises ready to take advantage of them.
Introducing object storage terminology and concepts – and how to leverage cost-effective cloud data management for mainframe
Object storage is coming to the mainframe. It’s the optimal platform for demanding backup, archive, DR, and big-data analytics operations, allowing mainframe data centers to leverage scalable, cost-effective cloud infrastructures.
For mainframe personnel, object storage is a new language to speak. It’s not complex, just a few new buzzwords to learn. This paper was written to introduce you to object storage, and to assist in learning the relevant terminology. Each term is compared to familiar mainframe concepts. Let’s go!
What is Object Storage?
Object storage is a computer data architecture in which data is stored in object form – as compared to DASD, file/NAS storage and block storage. Object storage is a cost-effective technology that makes data easily accessible for large-scale operations, such as backup, archive, DR, and big-data analytics and BI applications.
IT departments with mainframes can use object storage to modernize their mainframe ecosystems and reduce dependence on expensive, proprietary hardware, such as tape systems and VTLs.
Let’s take a look at some basic object storage terminology (and compare it to mainframe lingo):
- Objects. Object storage contains objects, which are also known as blobs. These are analogous to mainframe data sets.
- Buckets. A bucket is a container that hosts zero or more objects. In the mainframe realm, data sets are hosted on a volume – such as a tape or DASD device.
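To make the analogy concrete, here is a minimal sketch of how a data set name might be mapped to a bucket and an object key. The mapping scheme (dots become slashes, so the high-level qualifier becomes a key prefix), the `ObjectRef` type, and the bucket name are all assumptions made for illustration; they are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRef:
    bucket: str  # plays the role of the volume
    key: str     # plays the role of the data set name


def dataset_to_object(dsn: str, bucket: str) -> ObjectRef:
    """Map a dotted data set name to a slash-separated object key.

    Hypothetical scheme: each qualifier becomes one path segment, so the
    high-level qualifier ends up as the leading key prefix.
    """
    qualifiers = dsn.strip().upper().split(".")
    if not all(qualifiers):
        raise ValueError(f"empty qualifier in data set name: {dsn!r}")
    return ObjectRef(bucket=bucket, key="/".join(qualifiers))


ref = dataset_to_object("PROD.PAYROLL.BACKUP.G0001V00", "mainframe-archive")
print(ref.bucket, ref.key)
```

With a scheme like this, everything under one high-level qualifier shares a key prefix, which is convenient when access is later granted per prefix.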
Data Sets vs. Objects – a Closer Look
As with data sets, objects contain both data and some basic metadata describing the object’s properties, such as creation date and object size. Here is a table with a detailed comparison between data set and object attributes:
The object attributes described below are presented as defined in AWS S3 storage systems.
Volumes vs. Buckets – a Closer Look
Buckets, which are analogous to mainframe volumes, are unlimited in size. Separate buckets are often deployed for security reasons, and not because of performance limitations. A bucket can be assigned a life cycle policy that includes automatic tiering, data protection, replication, and automatic at-rest encryption.
The bucket attributes described below are presented as defined in AWS S3 storage systems.
In the z/OS domain, a SAF user and password are required, as well as the necessary authorization level for the volume and data set. For example, users with ALTER access to a data set can perform any action – read/write/create/delete.
In object storage, users are defined in the storage system. Each user is granted access to specific buckets, prefixes, and objects, and separate permissions are defined for each action, for example:
In addition, each user can be associated with a programmatic API key and API secret in order to access the bucket and the object storage via a TCP/IP-based API. When accessing data in the cloud, HTTPS is used to encrypt the in-transit stream. When accessing data on-premises, HTTP can be used to avoid encryption overhead. If required, the object storage platform can be configured to perform data-at-rest encryption.
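The per-action permission model described above can be sketched as a small in-memory check. This is only an illustration of the idea, not the policy engine of AWS S3 or any other platform; the `Grant` and `User` types, the action names, and the bucket and prefix values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    bucket: str
    prefix: str          # "" means the whole bucket
    actions: frozenset   # e.g. frozenset({"read", "write", "delete"})


@dataclass
class User:
    name: str
    grants: list = field(default_factory=list)


def is_allowed(user: User, action: str, bucket: str, key: str) -> bool:
    """Return True if any grant covers this action on this bucket/key."""
    return any(
        g.bucket == bucket and key.startswith(g.prefix) and action in g.actions
        for g in user.grants
    )


# A read-only auditor limited to one prefix of one bucket.
auditor = User("AUDITOR", [Grant("mainframe-archive", "PROD/PAYROLL/", frozenset({"read"}))])
print(is_allowed(auditor, "read", "mainframe-archive", "PROD/PAYROLL/BACKUP/G0001V00"))
print(is_allowed(auditor, "delete", "mainframe-archive", "PROD/PAYROLL/BACKUP/G0001V00"))
```

Unlike ALTER access to a data set, which permits every action at once, each action here must be granted explicitly.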
Disaster Recovery Considerations
While traditional mainframe storage platforms such as tape and DASD rely on full storage replication, object storage supports both replication and erasure coding. Erasure coding provides significant savings in storage space compared with keeping full copies, while still allowing the data to be spread over multiple geographical locations. For example, on AWS, data in the standard storage class is automatically spread across a minimum of 3 physically separate locations (Availability Zones) within a region, thus providing multi-site redundancy and disaster recovery from anywhere in the world. Erasure-coded buckets can also be fully replicated to another region, as is practiced with traditional storage. Most object storage platforms support both synchronous and asynchronous replication.
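The space savings of erasure coding are easy to see with a little arithmetic. The sketch below compares the raw capacity needed for full replication with that of an erasure-coded layout. The fragment counts (6 data plus 3 parity) and the 100 TB figure are illustrative assumptions, not the settings of any specific provider.

```python
def replication_raw_tb(logical_tb: float, copies: int) -> float:
    """Raw capacity when every byte is stored `copies` times."""
    return logical_tb * copies


def erasure_raw_tb(logical_tb: float, data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity when data is split into k fragments plus m parity fragments.

    With a code such as Reed-Solomon, any k of the k + m fragments are
    enough to rebuild the data, so up to m fragments can be lost.
    """
    overhead = (data_fragments + parity_fragments) / data_fragments
    return logical_tb * overhead


logical = 100.0  # TB of mainframe backup data (made-up figure)
print("3 full copies:", replication_raw_tb(logical, 3), "TB raw")
print("6+3 erasure coding:", erasure_raw_tb(logical, 6, 3), "TB raw")
```

Under these assumptions, both layouts survive the loss of more than one storage location, but the erasure-coded layout needs half the raw capacity of three full copies.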
Model9 – Connecting Object Storage to the Mainframe
Model9’s Cloud Data Manager for Mainframe is a software-only platform that leverages powerful, scalable cloud-based object storage capabilities for data centers that operate mainframes.
The platform runs on the mainframe’s zIIP processors, providing cost-efficient storage, backup, archive, and recovery functionalities with an easy-to-use interface that requires no object-storage knowledge or skills.
For mainframe shops that need to move data on or off the mainframe, whether to the cloud or to an alternative on-premises destination, FICON, the IBM mainstay for decades, is generally seen as the standard, and with good reason. When it was first introduced in 1998 it was a big step up from its predecessor ESCON that had been around since the early 1990s. Comparing the two was like comparing a firehose to a kitchen faucet.
FICON is fast, in part, because it runs over Fibre Channel in an IBM proprietary form defined by the ANSI FC-SB-3 standard (Single-Byte Command Code Sets-3 Mapping Protocol for Fibre Channel). In that schema it is an FC-4 (layer 4) protocol. As a mainframe protocol it is used on IBM Z systems to handle both DASD and tape I/O. It is also supported by other vendors of disk and tape storage and switches designed for the IBM environment.
Over time, IBM has increased speeds and added features such as High Performance FICON without significantly enhancing the disk and tape protocols that traverse it, meaning their limitations on data movement remain. For this reason, the popularity and long history of FICON do not make it the answer for every data movement challenge.
Stuck in the Past
One challenge of particular concern today is that mainframe secondary storage is still being written to tape via tape protocols, whether it is real physical tape or virtual tape emulating actual tape. Keeping tape as a central technology means dealing with tape mount protocols and tape management software to track where datasets reside on those miles of Mylar. The serial nature of tape and limitations of the original hardware often required large datasets to span multiple tape images.
Though virtual tapes written to DASD improved the speed of writes and recalls, the underlying process is still constrained by tape’s serialized protocols. This implies waiting for tape mounts and waiting for I/O cycles to complete before the next data can be written. When reading back, the system must traverse the tape image to find the specific dataset requested. In short, while traditional tape may have its virtues, speed – the 21st century speed of modern storage – is not among them. Even though tape and virtual tape are attached via FICON, the process of writing and recalling data relies on the underlying tape protocol for moving data, thus making FICON-attached tape less than ideal for many modern use cases.
Faster and Better
But there is an alternative that doesn’t rely on tape or emulate tape because it does not have to.
Instead, software generates multiple streams of data from a source and pushes data over IBM Open Systems Adapter (OSA) cards using TCP/IP in an efficient and secure manner to an object storage device, either on premises or in the cloud. The Open Systems Adapter functions as a network controller that supports many networking transport protocols, making it a powerful helper for this efficient approach to data movement. Importantly, because it is built on open networking standards, OSA is developing faster than FICON. For example, with the IBM z15 there is already a 25GbE OSA-Express7S card, while FICON is still at 16Gb with the FICON Express16 card.
While there is a belief common among many mainframe professionals that OSA cards are “not as good as FICON,” that is simply not true when the necessary steps are taken to optimize OSA throughput.
To achieve better overall performance, the data is captured well before tape handling, thus avoiding the overhead of tape management, tape mounts, etc. Rather than relying on serialized data movement, this approach breaks apart large datasets and sends them across the wire in simultaneous chunks, while also pushing multiple datasets at a time. Data can be compressed prior to leaving the mainframe and beginning its journey, reducing the amount of data that would otherwise be written. Dataset recalls and restores are also compressed and use multiple streams to ensure quick recovery of data from the cloud.
Having the ability to write multiple streams further increases throughput and reduces latency issues. In addition, compression on the mainframe side dramatically reduces the amount of data sent over the wire. If the software is also designed to run on zIIP engines within the mainframe, data discovery and movement, as well as backup and recovery workloads, will consume fewer billable MIPS, and TCP/IP cycles also benefit.
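The chunk, compress, and parallel-stream idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual implementation of any product: an in-memory dictionary stands in for the object store, `zlib` stands in for whatever compression is used, and the chunk size, key naming, and function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # 1 MiB, an arbitrary illustrative size

object_store = {}  # in-memory stand-in for an object storage bucket


def put_chunk(name: str, index: int, chunk: bytes) -> int:
    """Compress one chunk, store it under its own key, return compressed size."""
    compressed = zlib.compress(chunk)
    object_store[f"{name}/part-{index:05d}"] = compressed
    return len(compressed)


def push_dataset(name: str, data: bytes, streams: int = 4) -> int:
    """Split a dataset into chunks and push them over several parallel streams."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        sizes = pool.map(put_chunk, [name] * len(chunks), range(len(chunks)), chunks)
        return sum(sizes)


def recall_dataset(name: str) -> bytes:
    """Fetch all parts in order, decompress them, and reassemble the dataset."""
    keys = sorted(k for k in object_store if k.startswith(f"{name}/part-"))
    return b"".join(zlib.decompress(object_store[k]) for k in keys)


data = b"PAYROLL-RECORD-" * 200_000  # highly repetitive sample data
sent = push_dataset("PROD.PAYROLL.BACKUP", data)
print(f"{len(data)} bytes in, {sent} bytes sent over the wire")
```

Because each chunk is stored under its own key, no stream has to wait for another to finish, which is the contrast with serialized tape writes.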
This approach delivers mainframe data to cloud storage, including all dataset types and historical data, in a quick and efficient manner. This approach can also transform mainframe data into standard open formats that can be ingested by BI and analytics tools off the mainframe, with a key difference: when data transformation occurs on the cloud side, no mainframe MIPS are used to transform the data. This allows for the quick and easy movement of complete datasets, tables, image copies, etc. to the cloud, then makes all data available to open applications by transforming the data on the object store.
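As an illustration of what cloud-side transformation to an open format can involve, the sketch below converts fixed-length EBCDIC records with a packed-decimal field into CSV. The record layout (a 10-byte name followed by a 3-byte packed-decimal amount with two decimal places) and the choice of code page `cp037` are assumptions made for this example; real layouts come from the data set's copybook or equivalent metadata.

```python
import csv
import io
from decimal import Decimal

RECORD_LENGTH = 13  # hypothetical layout: 10-byte name + 3-byte packed amount


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, last nibble is the sign."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()  # 0xD means negative; 0xC or 0xF means positive
    value = int("".join(str(d) for d in nibbles))
    if sign == 0x0D:
        value = -value
    return Decimal(value).scaleb(-scale)


def records_to_csv(blob: bytes) -> str:
    """Convert a run of fixed-length EBCDIC records to CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "amount"])
    for offset in range(0, len(blob), RECORD_LENGTH):
        record = blob[offset:offset + RECORD_LENGTH]
        name = record[:10].decode("cp037").rstrip()
        amount = unpack_comp3(record[10:13], scale=2)
        writer.writerow([name, amount])
    return out.getvalue()


sample = "SMITH     ".encode("cp037") + bytes([0x12, 0x34, 0x5C])
print(records_to_csv(sample))
```

Since this code runs against the object store rather than on the mainframe, it consumes no mainframe cycles.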
A modern, software-based approach to data movement means there is no longer a need to go to your mainframe team to update the complex ETL process on the mainframe side.
To address the problem of hard-to-move mainframe data, this software-based approach provides the ability to readily move mainframe data and, if desired, readily transform it to common open formats. This data transformation is accomplished on the cloud side, after data movement is complete, which means no mainframe resources are required to transform the data.
- Dedicated software quickly discovers (or rediscovers) all data on the mainframe. Even with no prior documentation or insights, Model9 can rapidly assemble and map the data to be moved, expediting both modernization planning and data movement.
- Policies are defined to move either selected data sets or all data sets automatically, reducing oversight and management requirements dramatically as compared to other data movement methods.
- For the sake of simplicity, a software approach can be designed to invoke actions via a RESTful API or a management UI, as well as from the mainframe side via traditional batch or the command line.
- A software approach can also work with targets either on premises or in the cloud.
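The policy idea in the list above, moving either selected data sets or all of them automatically, can be sketched as simple name matching. This is only an illustration: the function name and the include/exclude structure are hypothetical, and the patterns use Python's shell-style wildcards (where `*` also matches dots) rather than z/OS data set mask semantics.

```python
from fnmatch import fnmatchcase


def select_datasets(catalog, include, exclude=()):
    """Return the data set names matched by any include pattern and no exclude pattern."""
    return [
        dsn
        for dsn in catalog
        if any(fnmatchcase(dsn, pattern) for pattern in include)
        and not any(fnmatchcase(dsn, pattern) for pattern in exclude)
    ]


catalog = [
    "PROD.PAYROLL.DATA",
    "PROD.HR.DATA",
    "TEST.PAYROLL.DATA",
]

# A policy that moves every production data set except HR data.
print(select_datasets(catalog, include=["PROD.*"], exclude=["*.HR.*"]))

# A policy that moves everything.
print(select_datasets(catalog, include=["*"]))
```

Once a policy like this is defined, it can be applied on a schedule without anyone hand-picking data sets for each run.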
In summary, a wide range of useful features can make data movement with a software-based approach intuitive and easy. By avoiding older FICON and tape protocols, a software-based approach can push mainframe data over TCP/IP to object storage in a secure and efficient manner, making it the answer to modern mainframe data movement challenges!