Next Webinar on Sept. 22: How Nedbank Connected Their Mainframe to the Cloud with Model9 & Microsoft Azure


Author: Gil Peleg

We did the impossible and the unlikely. Venture capital players are all about the newest and most innovative technologies and market opportunities. They rarely consider plays in mature markets – and mainframe technology has long been seen as a mature market; some would have even said a dying market.

We saw things differently and birthed a technology that gives mainframe organizations the power to engage cloud and to build on their institutional strength and proven mainframe technology while engaging with the most advanced cloud capabilities and ultra-agile business practices. 

Customers were thirsty for this kind of innovation and we knew we had to move fast if we were going to compete with incumbent giants such as IBM, EMC, Broadcom/CA, and Oracle. The answer had to be venture capital to give us the ability to scale quickly, achieve market penetration, and become the go-to solution for a global base of potential enterprise customers.

Venture capital has been as important as invention and scientific breakthroughs. It gave birth to most of the computer industry (other than IBM), and gave us innumerable consumer brands such as Waze, Mobileye, Uber, and even the emergence of autonomous cars. VCs are special people – many are veterans of startups that know about ramping up, know about market windows, and know that the ability to scale is a matter of life and death in tech markets.

It wasn’t easy for the Model9 founders. We could see eyes roll a bit when we explained that we were tackling the mainframe market. But once we got folks to understand it’s a huge opportunity and that we really have the power to unlock much of the world’s corporate data, the power to put data owners back in control and break a lock on data that sometimes spanned generations, and the power to bring cloud capabilities to the needs of the enterprise – they started to listen and listen closely.

And, like other’s that have won the nod from VCs, we demonstrated that we had the team, the experts and the drive, to take a modest startup and its innovative technology, and use it to transform the huge enterprise computing market.

We also had to clarify how we differed from a wide spectrum of VC-backed storage and data management companies that are transforming industries (WekaIO, VAST, Pure, Cohesity…), but none touch the mainframe, despite its great importance and central role in the enterprise market and world economics in general. Could it be, they wondered, that we were the first to crack open a market long monopolized by a few big players? Were we really doing something that unique?

Yes, we were – and they saw the light.

The lesson in all this may be that opportunities can be right in front of you and still be overlooked. The mainframe world can seem arcane, closed, and unfamiliar to most who have grown up in the PC/Server/Web/device era. But it remains an arena of action, absorbing billions of dollars in spending each year and contributing directly to trillions of dollars in economic activity.

And, surprisingly, it is a green field for innovation and VC investing. It has been a long time since this corner of IT was viewed seriously as a field for VC investment but it is finally being viewed not as IT “island” but, much more accurately, as a “supercontinent,” waiting to be fully integrated and optimized with the rest of our dynamic, information-rich and connected society.

For VCs this should spell opportunity, but many of them and nearly all of the players in the startup world, have grown up and profited entirely within the non-mainframe world. We got lots of those blank stares when we started out, until our persistence led us to Intel Capital, StageOne Ventures, North First Ventures and GlenRock, who grasped the importance of the mainframe market and saw our potential — and we’re grateful that they have.

We hope more venture firms will follow them to this land of opportunity.

Mainframe modernization is a top priority for forward-thinking I&O leaders who don’t want to be left behind by rapid technological change. But moving applications to the cloud has proven to be difficult and risky, which can lead to analysis-paralysis or initiatives that run over budget and fail. This is slowing the pace of adaptation, starving cloud functions of access to mainframe data, and often inhibiting any positive action at all.

So, at the moment, most mainframe data is effectively siloed, cut off from BI or AI cloud applications and data lakes in the cloud and simply locked in a “keep the lights on” mentality that is dangerous if continued too long.

Part of the problem is that mainframe organizations have focused on an application-first approach to cloud engagement which is usually the wrong approach because cost and complexity get in the way. Leaders should instead take a data-first approach that allows them to begin modernization by liberating mainframe data and moving storage and even their backup and recovery functions to the cloud. This has the benefit of making mainframe data immediately accessible in the cloud without requiring any mainframe application changes.

Why Is Mainframe Modernization So Hard?

The mainframe environment has been aptly described as a walled garden. Everything inside that garden works well, but the wall makes this Eden increasingly irrelevant to the rest of the world. And the isolation keeps the garden from reaching its full potential. 

The walled garden is a result of the inherent nature of mainframe technology, which has evolved apart from the cloud and other on-prem environments. This means the architecture is fundamentally different, making a so-called lift-and-shift move to the cloud very difficult. Applications built for mainframe must stay on the mainframe and adapting them to other environments is often prohibitive. At an even more fundamental level, mainframe data is stored in forms that are incomprehensible to other environments.

How does Model9 Cloud Data Manager for Mainframe Work?

While mainframe lift-and-shift strategies may be very challenging, the movement of data to the cloud has suddenly gotten much easier thanks to Model9 Cloud Data Manager for Mainframe, which represents a fresh technology direction.

Our patented and proven technology takes mainframe data and moves it quickly and easily to the cloud and can then transform it in the cloud to almost any industry-standard form. 

With Model9, mainframe data is first moved to an object storage target in a public cloud or a private cloud . The process is vendor agnostic and eliminates most of the traditional costs associated with mainframe ETL because it leverages zIIP engines to handle movement (over TCP/IP) and accomplishes the “transform” step in the cloud, without incurring MSU costs.

This can work with any mainframe data but is especially helpful for historic data and any data resident on tape or virtual tape, normally hard to access even for the mainframe itself.

The result is backup, archiving, and recovery options in the cloud that are cost-effective, faster, and easier to access than in traditional on-prem systems. And, Model9 has almost no impact on existing mainframe applications and operations. It is a data-first approach that allows you to transition mainframe data into the cloud with a software only solution

The Benefits Of Model9’s Data-first Approach

By focusing on the simpler task of moving mainframe data first, organizations gain multiple advantages including:

  • Cost Reduction by reducing or eliminating the tape or VTL hardware footprint and associated mainframe software (DFSMShsm etc.), as well as reducing MSU charges
  • Cloud can deliver a full data protection solution that can provide security and, “recover anywhere” capability.
  • Cloud-based transformation immediately unlocks mainframe data for use in cloud applications.
  • Cloud can also yield performance improvements such as reduced backup windows, reduced peak processing demand, and reduced system overhead.

Data First Works

Data-first mainframe modernization empowers leaders to broaden their cloud adoption strategy and secure more immediate benefits. It can accelerate cloud migration projects by leveraging non-mainframe skills and delivering simplified data movement. Organizations can readily maintain and access mainframe data after migrating to the cloud to meet corporate and regulatory requirements without the traditional high costs.

In addition, a data-first approach reduces the burden on your mainframe by offloading data to external platforms while empowering analytics and new applications in the cloud for some of your most valuable data.

According to Gartner, with new approaches to data and cloud, ‘Through 2024, 40% of enterprises will have reduced their storage and data protection costs by over 70% by implementing new storage technologies and deployment methods.’

Best of all, a data-driven approach allows organizations to combine the power of the mainframe with the scalability of the cloud and modernize on their own terms.

If you are still using a legacy VTL/Tape solution, you could be enjoying better performance by sending backup and archive copies of mainframe data directly to cloud object storage.

The reason for this is when you replace legacy technology with modern object storage, you can eliminate bottlenecks that throttle your performance. In other words, you can build a connection between your mainframe and your backup/archive target that can move data faster. You can think of this as “ingestion throughput.”

3 ways you can increase ingestion throughput for backup and archive copies of mainframe data

Here are the top three ways you can increase ingestion throughput:

#1: Write data in parallel, not serially

The legacy mainframe tapes used to make backup and archive copies required data to be written serially. This is because physical tape lived on reels, and you could only write to one place on the tape at a time. When VTL solutions virtualized tape, they carried over this sequential access limitation.

In contrast, object storage does not have this limitation and does not require data to be written serially. Instead, it is possible to use a new method to send multiple chunks of data simultaneously directly to object storage using TCP/IP. 

#2: Use zIIP engines instead of mainframe MIPS

Legacy mainframe backup and archive solutions use MSUs, taking away from the processing available to other tasks on the mainframe. This in effect means that your mainframe backups are tying up valuable mainframe computing power, reducing the overall performance you can achieve across all the tasks you perform there.

You do not need to use MSUs to perform backup and archive tasks. Instead, you can use the mainframe zIIP engines—reducing the CPU overhead and freeing up MSUs to be used for other things.

#3: Compress data before sending it

Legacy mainframe backup and archive solutions do not support compressing data before sending it to Tape/VTL. This means that the amount of data that needs to be sent is much larger than it could be using modern compression techniques.

Rather it is possible to compress your data before sending it to object storage. Not only do you benefit from smaller data transfer sizes, but you can increase the effective capacity of your existing connection between the mainframe and storage target. For example, compressing data at a 3:1 ratio would effectively turn a 1GB line into a 3GB line—allowing you to send the same amount of data faster while still using your existing infrastructure.

Faster than VTL: Increase Mainframe Data Management Performance 

Replacing your legacy VTL/Tape solution with a modern solution that can compress and move data to cloud-based object storage can significantly decrease the amount of time it takes to backup and archive your mainframe data, without increasing resource consumption.

Writing in parallel, leveraging zIIP engines, and employing compression is a low-risk, and high-reward option that leverages well-known, well-understood, and well-proven technologies to address a chronic mainframe challenge. This can yield immediate, concrete benefits such as reducing the amount of time it takes for you to backup and archive your mainframe data and cut costs while boosting capabilities.

Case study - Go faster than VTL with Model9

Cloud-first strategies propose an optimistic ideal in which on-prem IT is a thing of the past and any imaginable function can be bought or created in the cloud. To be sure, this vision is based on a reality: There are many successful organizations that have been “born in the cloud” and many others that have successfully moved most or all functionality there, as well.

But I&O leaders of mainframe-based organizations, though often subjected to relentless questioning regarding potential financial benefits from moving to the cloud, know that the nature of mainframe and the “gravity” of the data and applications on premises, make a move to the cloud challenging, at best. For them, ‘cloud first’ can seem to be nothing but a chimera.

However, it doesn’t have to be that way. Modern tools put the cloud within reach as never before; initially as an adjunct to mainframe and, over the long term, perhaps even as a replacement if such a move actually makes business sense (and, it often does not!).    

Cloud is best considered as part of a mainframe modernization effort. The very unique characteristics of mainframe and mainframe applications means that migrating them to the cloud is difficult, requiring refactoring and/or rearchitecting, which is time-consuming, expensive, and risky. So, approaches that strengthen the mainframe while engaging with cloud make sense.

Blocked by Siloed Mainframe Data

Reluctance to attempt actual application migration to the cloud is an acknowledgement that the default approach trends not towards cloud first, but instead a ‘Mainframe + Cloud’ strategy. But the result is mainframe data silos that limit business options and reduce the utility of the data.

Siloing your mainframe data has an immediate business impact. Your most valued data is excluded from some of the most important analytical tools available, in particular cloud-based data lakes that have become a key tool for enterprises striving for agility and insight.

That absence of data also significantly limits the potential ROI of any cloud adoption and integration strategy because cloud capabilities will be missing a critical portion of the universe of relevant data. And, ultimately, it leaves your company in a straightjacket, restricting the potential dynamism of your IT organizations.

Data-led Migrations are the Key

Fortunately, this problem has a solution. Rather than taking the old-school approach of attempting to migrate mainframe applications all at once or keeping your data siloed, there is a modern alternative. It is based on the prescient idea that data itself is the answer. Data gravity is the colloquial term for the insight that data has power wherever it is located and can be accessed. That’s true when the data is locked exclusively in the mainframe environment and it is also true if it can be moved to the cloud. Move the data, according to this insight, and functionality will naturally tend to follow.

Put another way, moving the data is what matters. Once the data is available in a different environment the organization will evolve ways to access and use that data – either by migrating applications or by choosing to adopt a cloud capability that can deliver the same results with less cost and trouble.

Model9 delivers the capability to move your data and empowers you to choose when and, equally important, how much. For example, you can start with archival data that is used infrequently in the mainframe but has potentially limitless value in an analytics context. By moving that data to the cloud, you can free storage capacity on-prem (potentially allowing you to eliminate tape and VTLs). The mainframe can still access the data when needed but analytic tools in the cloud may use it much more often.

With data gravity increasingly centered in the cloud, you are in charge. You can continue to support mainframe while gradually building new applications and functionality in the cloud. Or, the data is there if you eventually decide on a full lift and shift. 

No matter the scale of movement required, Model9 can support it. Data can be moved without first having to select only files deemed relevant. All the data can be moved. And the further slicing and transformation into desired format can be accomplished in the cloud. You can enjoy all the benefits of mainframe data in the cloud, while still retaining the ability to refactor your mainframe applications only when you are ready, if at all

Model9 puts you in charge of your data and lets you put data gravity to work for your goals, allowing you to reshape your IT operations the way you want.

On-Demand Webinar: Mainframe Migration with Model9 & AWS

Mainframe modernization is a broad topic and one that elicits symptoms of anxiety in many IT professionals. Whether the goals are relatively modest, like simply updating part of the technology stack or offloading a minor function to the cloud, or an ambitious goal like a change of platform with some or all functions heading to the cloud, surveys show it is a risky business…

For example, according to the 2020 Mainframe Modernization Business Barometer Report, published by, a UK software company, some 74 percent of surveyed organizations have started a modernization program but failed to complete it. This is in accord with similar studies   highlighting the risks associated with ambitious change programs.  

Perhaps that’s why mainframe-to-cloud migration is viewed with such caution. And, indeed, there are at least five reasons to be wary (but in each case, the right strategy can help!)

Top 5 reasons why mainframe to cloud migration initiatives fail

A focus on lift and shift of business logic

Lift and shift is easier said than done when it comes to mainframe workloads. Mainframe organizations that have good documentation and models can get some clarity regarding business logic and the actual supporting compute infrastructure.  However, in practice, such information is usually inadequate. Even when the documentation and models are top notch, they can miss crucial dependencies or unrecognized processes. As a consequence, efforts to recreate capabilities in the cloud can yield some very unpleasant surprises when the switch is flipped. That’s why many organizations take a phased and planful approach, testing the waters one function at a time and building confidence in the process and certainty in the result. Indeed, some argue that the lift and shift approach is actually obsolete. One of the enablers for the more  gradual approach is the ability to get mainframe data to the cloud when needed. This is a requirement for any ultimate switchover but if it can be made easy and routine it also allows for parallel operations, where cloud function can be set up and  tested with real data, at scale, to make sure nothing is left to chance and that a function equal to or better than on-premises has been achieved.

Ignoring the need for hybrid cloud infrastructure

Organizations can be forgiven for wanting to believe they can achieve a 100 percent cloud-based enterprise. Certainly, there are some valid examples of organizations that have managed this task. However, for a variety of good, practical reasons, analysts question whether completely eliminating on-premises computing is either achievable or wise. A “Smarter with Gartner” article, Top 10 Cloud Myths, noted “The cloud may not benefit all workloads equally. Don’t be afraid to propose non cloud solutions when appropriate.” Sometimes there’s a resiliency argument in favor of retaining on-prem capabilities. Or, of course, there may be data residency or other requirements tilting the balance. The point is that mainframe cloud migration that isn’t conceived in hybrid terms is nothing less than a rash burning of one’s bridges. And a hybrid future, particularly when enabled by smooth and reliable data movement from mainframe to cloud, can deliver the best of both worlds in terms of performance and cost-effective spending. 

Addressing technology infrastructure without accounting for a holistic MDM strategy

Defined by IBM as “a comprehensive process to drive better business insights by providing a single, trusted, 360-degree view into customer and product data across the enterprise,” master data management (MDM) is an important perspective to consider in any migration plan.  After taking initial steps to move data or functions to the cloud, it quickly becomes apparent that having a comprehensive grasp of data, no matter where it is located, is vital. Indeed, a recent TDWI webinar dealt with exactly this topic, suggesting that multi-domain MDM can help “deliver information-rich, digitally transformed applications and cloud-based services.” So, without adaptable, cloud-savvy MDM, migrations can run into problems.

Assuming tape is the only way to back up mainframe data

Migration efforts that neglect to account for the mountains of data in legacy tape and VTL storage can be blindsided by how time consuming and difficult it can be to extract that data from the mainframe environment. This can throw a migration project off schedule or lead to business problems if backup patterns are interrupted or key data suddenly becomes less accessible. However, new technology makes extraction and movement much more feasible and the benefits of cloud data storage over tape in terms of automation, access, and simplicity are impressive. 

Overlooking the value of historical data accumulated over decades

A cloud migration is, naturally, a very future-focused activity in which old infrastructure and old modes of working are put aside. In the process, organizations are sometimes tempted to leave some of their data archives out of the picture, either through simply shredding tapes no longer retained under a regulatory mandate or simply warehousing them. This is particularly true for older and generally less accessible elements. But for enterprises fighting to secure their future in a highly competitive world, gems of knowledge are waiting regarding every aspect of the business – from the performance and function of business units, the shop floor and workforce demographics to insights into market sectors and even consumer behavior. With cloud storage options, there are better fates for old data than gathering dust or a date with the shredder.  Smart organizations recognize this fact and make a data migration strategy, the foundation for their infrastructure modernization efforts. The data hiding in the mainframe world, is truly an untapped resource that can now be exploited by cloud-based services.

Failure is not an option       

Reviewing these five potential paths to failure in mainframe-cloud migration should not be misconstrued as an argument against cloud. Rather, it is intended to show the pitfalls to avoid.  When considered carefully and planfully – and approached with the right tools and the right expectations – most organizations can find an appropriate path to the cloud.

Get started with your mainframe modernization journey!

Vendors are scrambling to deliver modern analytics to act on streams of real-time mainframe data.  There are good reasons for attempting this activity, but they may actually be missing the point or at least missing a more tantalizing opportunity.  

Real-time data in mainframes comes mostly from transaction processing. No doubt, spotting a sudden national spike in cash withdrawals from a bank’s ATM systems or an uptick in toilet paper sales in the retail world may have significance beyond the immediate “signal” to reinforce cash availability and reorder from a paper goods supplier. These are the kinds of things real-time apostles rave about when they tout the potential for running mainframe data streams through Hadoop engines and similar big data systems.

What’s missed, however, is the fact that mainframe systems have been quietly accumulating data points just like this for decades. And where mainframe data can be most valuable is in supporting analytics across the time axis. Looking at when similar demand spikes have happened over time and their duration and repetition offers the potential to predict them in the future and can hint at the optimal ways to respond and their broader meaning.

Furthermore, for most enterprises, a vast amount of real-time data exists outside the direct purview of mainframe: think about the oceans of IoT information coming from machinery and equipment, real-time sensor data in retail, and consumer data floating around in the device universe. Little of this usually gets to the mainframe. But it is this data, combined with mainframe data that is not real-time (but sometimes near-real-time), that may have the greatest potential as a font of analytic insight, according to a recent report.

To give mainframes the power to participate in this analytics bonanza requires some of the same nostrums being promoted by the “real-time” enthusiasts but above all requires greatly improving access to older mainframe data, typically resident on tape or VTL.

The optimal pattern here should be rescuing archival and non-real-time operational data from mainframe storage and sharing it with on-prem or cloud-based big data analytics in a data lake.  This allows the mainframe to continue doing what it does best while providing a tabula rasa for analyzing the widest range and largest volume of data.

Technology today can leverage the too-often unused power of zIIP engines to facilitate data movement inside the mainframe and help it get to new platforms for analytics (ensuring necessary transformation to standard formats along the way).

It’s a way to make the best use of data and the best use of mainframe in its traditional role while ensuring the very best in state-of-the-art analytics.  This is a far more profound opportunity than simply dipping into the flow or real-time data in the mainframe. It is based on a fuller appreciation of what data matters and how data can be used. And it is the path that mainframe modernizers will ultimately choose to follow.

Get started with your mainframe modernization journey!

Blame the genius that gave us the term “cloud” as shorthand for distributed computing. Clouds, in many languages and cultures, are equated with things ephemeral and states of mind that are dreamy or thoughts that are vague.

Well, cloud computing is none of those “cloud things.”  It is the place where huge capital investments, the best thinking about reliability, and the leading developments in technology have come together to create a value proposition that is hard to ignore.

When it comes to reliability, as a distributed system – really a system of distributed systems – cloud accepts the inevitability of failure in individual system elements and in recompense, incorporates very high levels of resilience across the whole architecture.

For those counting nines (those reliability figures quoted as there can be enormous comfort in the figures quoted by cloud providers. Those digging deeper, may find the picture to be less perfect in ways that make the trusty mainframe seem pretty wonderful. But the vast failover capabilities built into clouds, especially those operated by the so-called hyperscalers, is so immense as to be essentially unmatchable, especially when other factors are considered.  

The relevance of this for mainframe operators is not about “pro or con.”  Although some enterprises have taken the “all cloud” path, in general, few are suggesting the complete replacement of mainframe by cloud.

What is instead true, is that the cloud’s immense reliability – its ability to offer nearly turnkey capabilities in analytics and many other areas, and its essentially unlimited scalability – means it is the only really meaningful way to supplement mainframe core capabilities and in 2021 its growth is unabated.

Whether it is providing the ultimate RAID-like storage reliability across widely distributed physical systems to protect and preserve vital data or spinning up compute power to ponder big business (or tech) questions, cloud is simply unbeatable.

So, for mainframe operations, it is futile to try to “beat” cloud but highly fruitful to join – the mainframe + cloud combination is a winner.

Indeed, Gartner analyst Jeff Vogel, in a September 2020 report, “Cloud Storage Management Is Transforming Mainframe Data,” predicts that one-third of mainframe data (typically backup and archive) will reside in the cloud by 2025 — most likely a public cloud — compared to less than 5% at present – a stunning shift.

This change is coming. And it is great news for mainframe operators because it adds new capabilities and powers to what’s already amazing about mainframe. And it opens the doors to new options that have immense potential benefits for enterprises ready to take advantage of them.

Get started with your mainframe modernization journey!

Change is good – a familiar mantra, but one not always easy to practice. When it comes to moving toward a new way of handling data, mainframe organizations, which have earned their keep by delivering the IT equivalent of corporate-wide insurance policies (rugged, reliable, and risk-averse), naturally look with caution on new concepts like ELT — extract, load and transform.

Positioned as a lighter and faster alternative to more traditional data handling procedures such as ETL, (extract, transform and load), ELT definitely invites scrutiny. And that scrutiny can be worthwhile.

Definitions provided by say that ELT is “a data integration process for transferring raw data from a source server to a data system (such as a data warehouse or data lake) on a target server and then preparing the information for downstream uses.”  In contrast, another source defines ETL as “three database functions that are combined into one tool to pull data out of one database and place it into another database.”

Model9 | Diagram | ETL vs. ELT

The crucial functional difference in those definitions is the exclusive focus on database-to-database transfer with ETL, while ELT is open-ended and flexible. To be sure, there are variations in ETL and ELT that might not fit those definitions but the point is that in the mainframe world ETL is a tool with a more limited focus, while ELT is focused on jump-starting the future.

While each approach has its advantages and disadvantages, let’s take a look as to why we think ETL is all wrong for mainframe data migration.

ETL is Too Complex  

ETL was not originally designed to handle all the tasks it is now being asked to do. In the early days it was often applied to pull data from one relational structure and get it to fit in a different relational structure. This often included cleansing the data, too. For example, a traditional RDBMS can get befuddled by numeric data where it is expecting alpha data or by the presence of obsolete address abbreviations. So, ETL is optimized for that kind of painstaking, field-by-field data checking, `cleaning,’ and data movement, not so much for feeding a hungry Hadoop database or modern data lake. In short, ETL wasn’t invented to take advantage of all the ways data originates and all the ways it can be used in the 21st century.

ETL is Labor Intensive 

All that RDBMS-to-RDBMS movement takes supervision and even scripting. Skilled DBAs are in demand and may not last at your organization.  So, keeping the human part of the equation going can be tricky. In many cases, someone will have to come along and recreate their hand-coding or replace it whenever something new is needed. 

ETL is a Bottleneck 

Because the ETL process is built around transformation, everything is dependent on the timely completion of that transformation.  However, with larger amounts of data in play (think, Big Data), this can make the needed transformation times inconvenient or impractical, turning ETL into a potential functional and computational bottleneck.

ETL Demands Structure 

ETL is not really designed for unstructured data and can add complexity rather than value when asked to deal with such data. It is best for traditional databases but does not help much with the huge waves of unstructured data that companies need to process today.

ETL Has High Processing Costs 

ETL can be especially challenging for mainframes because they generally incur MSU processing charges and can burden systems when they need to be handling real-time challenges.  This stands in contrast to ELT which can be accomplished using mostly the capabilities of built-in zIIP engines, which cuts MSU costs, with additional processing conducted in a chosen cloud destination. In response to those high costs, some customers have taken the Transformation stage into the cloud to handle all kinds of data transformations, integrations, and preparations to support analytics and the creation of data lakes.

Moving Forward

It would obviously be wrong to oversimplify a decision regarding the implementation of ETL or ELT, there are too many moving parts and too many decision points to weigh. However, what is crucial is understanding that rather than being focused on legacy practices and limitations, ELT speaks to most of the evolving IT paradigms. ELT is ideal for moving massive amounts of data. Typically the desired destination is the cloud and often a data lake, built to ingest just about any and all available data so that modern analytics can get to work. That is why ELT today is growing and why it is making inroads specifically in the mainframe environment. In particular, it represents perhaps the best way to accelerate the movement of data to the cloud and to do so at scale. That’s why ELT is emerging as a key tool for IT organizations aiming at modernization and at maximizing the value of their existing investments.

Webinar: Add MF data sets to data analytics w/ Model9 & AWS

One of the great revelations for those considering new or expanded cloud adoption is the cost factor – especially with regard to storage. The received wisdom has long been that nothing beats the low cost of tape for long-term and mass storage.

In fact, though tape is still cheap, cloud options are getting very close such as with Amazon S3 Glacier Deep Archive, and offer tremendous advantages that tape can’t match. A case in point is Amazon S3 Intelligent-Tiering. 

Tiering (also called hierarchical storage management or HSM) is not new. It’s been part of the mainframe world for a long time, but with limits imposed by the nature of the storage devices involved and the software. According to Amazon, Intelligent Tiering helps to reduce storage costs by up to 95 percent and now supports automatic data archiving. It’s a great way to modernize your mainframe environment by simply moving data to the cloud, even if you are not planning to migrate your mainframe to AWS entirely.

How does Intelligent-Tiering work? The idea is pretty simple. When objects are found to have been rarely accessed over long periods of time, they are automatically targeted for movement to less expensive storage tiers.

Migrate Mainframe to AWS

In the past (both in mainframes and in the cloud) you had to define a specific policy stating what needed to be moved to which tier and when, for example after 30 days or 60 days. The point with the new AWS tiering is that it automatically identifies what needs to be moved, when, and then moves it at the proper time. To migrate mainframe to Amazon S3 is no problem because modern data movement technology now allows you to move both historical and active data directly from tape or virtual tape to Amazon S3. Once there, auto-tiering can transparently move cold and long-term data to less expensive tiers.

This saves the trouble of needing to specifically define the rules. By abstracting the cost issue, AWS simplifies tiering and optimizes the cost without impacting the applications that read and write the data. Those applications can continue to operate under their usual protocols while AWS takes care of selecting the optimal storage tier. According to AWS, this is the first and, at the moment, the only cloud storage that delivers this capability automatically.

When reading from tape, the traditional lower tier for mainframe environments, recall times are the concern as the system has to deal with tape mount and search protocols. In contrast, Amazon S3 Intelligent-Tiering can provide a low millisecond latency as well as high throughput whether you are calling for data in the Frequent or Infrequent access tiers. In fact, Intelligent-Tiering can also automatically migrate the most infrequently used data to Glacier, the durable and extremely low-cost S3 storage class for data archiving and long-term backup. And with new technology allowing efficient and secure data movement over TCP/IP, getting mainframe data to S3 is even easier.

The potential impact on mainframe data practices

For mainframe-based organizations this high-fidelity tiering option could be an appealing choice compared with tape from both a cost and benefits perspective. However, the tape comparison is rarely that simple. For example, depending on the amount of data involved and the specific backup and/or archiving practices, any given petabyte of data needing to be protected may have to be copied and retained two or more times, which immediately makes tape seem a bit less competitive. Add overhead costs, personnel, etc., and the “traditional” economics may begin to seem even less appealing.

Tiering, in a mainframe context, is often as much about speed of access as anything else. So, in the tape world, big decisions have to be made constantly about what can be relegated to lower tiers and whether the often much-longer access times will become a problem after that decision has been made. But getting mainframe data to S3, where such concerns are no longer an issue, is now easy. Modern data movement technology means you can move your mainframe data in mainframe format directly to object storage in the cloud so it is available for restore directly from AWS.

Many mainframe organizations have years, even decades of data on tape. The management of this tape data is retained only in the tape management system. Or perhaps it was just copied forward from a prior tape system upgrade.  How much of this data is really needed? Is it even usable anymore? To migrate mainframe to AWS, specifically this older data, allows management of the data in a modern way and can reduce the amount of tape data on-premises.

And what about those tapes that today are shipped off-site for storage and recovery purposes? Why not put that data on cloud storage for recovery anywhere? 

For mainframe organizations interested in removing on-premise tape technology, reducing tape storage sizes, or creating remote backup copies, cloud options like Amazon S3 Intelligent Tiering can offer cost optimization that is better “tuned” to an organization’s real needs than anything devised manually or implemented on-premises. Furthermore, with this cloud-based approach, there is no longer any need to know your data patterns or think about tiering, it just gets done.

Best of all, you can now perform a stand-alone restore directly from cloud. This is especially valuable with ransomware attacks on the rise because there is no dependency on a potentially compromised system.

You can even take advantage of AWS immutable copies and versioning capabilities to further protect your mainframe data.

Getting there

Of course, in order to take advantage of cloud storage like Amazon S3 Intelligent Tiering, you need to find a way to get your mainframe data out of its on-premises environment. Traditionally, that has presented a big challenge. But, as with multiplying storage options, the choices in data movement technology are also improving. For a review of new movement options, take a look at a discussion of techniques and technologies for Mainframe to Cloud Migration.

We recently looked at the topic of “Why Mainframe Data Management is Crucial for BI and Analytics” in an Analytics Insight article written by our CEO, Gil Peleg. Our conclusions, in brief, are that enterprises are missing opportunities when they allow mainframe data to stay siloed. And, while that might have been acceptable in the past, today data and analytics are very critical to achieving business advantage.

How did we get here? Mainframes are the rock on which many businesses built their IT infrastructure. However, while the rest of IT has galloped toward shared industry standards and even open architectures, mainframe has stood aloof and unmoved. It operates largely within a framework of proprietary hardware and software that does not readily share data. But with the revolutionary pace of change, especially in the cloud, old notions of scale and cost have been cast aside. As big and as powerful as mainframe systems are, there are things the cloud can now do better, and analytics is one of those things.

In the cloud no problem is too big. Effectively unlimited scale is available if needed and a whole host of analytic tools like Kibana, Splunk and Snowflake, have emerged to better examine not only structured data but also unstructured data, which abounds in mainframes.

Cloud tools have proven their worth on “new” data, yielding extremely important insights. But those insights could be enhanced, often dramatically, if mainframe data, historic and current, were made available in the same way or, better yet combined – for instance in modern cloud-based data lakes.

Eliminating silos

It turns out that most organizations have had a good excuse for not liberating their data: It has been a difficult and expensive task. For example, mainframe data movement, typically described as “extract, transform, and load” (ETL), requires intensive use of mainframe computing power. This can interfere with other mission-critical activities such as transaction processing, backup, and other regularly scheduled batch jobs. Moreover, mainframe software vendors typically charge in “MSUs” which roughly correlate with CPU processing loads. 

This is not a matter of “pie in the sky” thinking. Technology is available now to address and reform this process. Now, mainframe data can be exported, loaded, and transformed to any standard format in a cloud target. There, it can be analyzed using any of a number of tools. And this can be done as often as needed. What is different about this ELT process is the fact that it is no longer so dependent on the mainframe. It sharply reduces MSU charges by accomplishing most of the work on built-in zIIP engines, which are a key mainframe component and have considerable processing power. 

What does all this mean? It means data silos can be largely a thing of the past. It means an organization can finally get at all its data and can monetize that data. It means opening the door to new business insights, new business ideas, and new business applications.

An incidental impact is that there can be big cost savings in keeping data in the cloud in storage resources that are inherently flexible (data can move from deep archive to highly accessible quickly) rather than on-premises. And, of course, no capital costs – all operational expenses. Above all, though, this provides freedom. No more long contracts, mandatory upgrades, services, staff, etc. In short, it’s a much more modern way of looking at mainframe storage.

Webinar: Add MF data sets to data analytics w/ Model9 & AWS
Register for a Demo