Author: Danielle Zubery
For decades, mainframe data tasks have regularly included ETL – extract, transform and load – as a key step on the road to insights. Indeed, ETL has been the standard process for copying data from any given source into a destination application or system. ETL got a lot of visibility with the rise in data warehouse operations but was often a bottleneck in those same data warehouse projects.
Today, ETL is still the default choice for data movement, especially in the mainframe. But there is a legitimate alternative – ELT – extract, load, and transform.
As the reshuffling of terms implies, ELT takes a much different approach, first extracting data from wherever it currently resides and then loading it, generally to a target outside of the mainframe. It is there, wherever that “there” is, that the hard work of transform happens, typically as a prelude to the application of analytics.
So, ELT is an acronym, but one that’s pretty revolutionary.
Why? By reframing the idea of ETL with the technologies of today, the entire process has the potential to be faster, easier, and less expensive because it can use the most appropriate and cost-effective resources. Not just the mainframe CPU.
ELT tends to require less maintenance than ETL, which typically has many requirements for manual, ad hoc intervention and management. In contrast, ELT is based on automated, cloud-based processing. Similarly, ELT loads more quickly, since transformation is closely linked to the ultimate cloud-based analysis work. ELT, then, is primarily concerned with getting data from mainframe to the cloud. Finally, of course, it is usually faster overall. And, because it depends primarily on pay-as-you-go cloud resources rather than on the billing structure of the mainframe, it is generally less expensive.
ELT empowers the routine and regular movement of mainframe operational and archived data from expensive and slow tape and VTL to storage environments that are both fast and highly cost-effective, such as AWS S3 Tiered Storage. ELT can also deliver data directly for transformation to standard formats in the cloud – and then make that data available to data lakes and other modern BI and analytics tools. Because ELT retains its original format and structure, the options for how the data can be used (transformed) in the cloud are practically unlimited.
The key to ELT on the mainframe is, of course, zIIP engines, the helpful processing capability provided by IBM for handling exactly this kind of `non-critical’ activity. It’s just that no one tried before.
With zIIP help and TCP/IP to assist in movement, buried data sets can be liberated from mainframe data silos and deliver real monetary value. What’s more, companies that have tried ELT have discovered how easy it is to move mainframe data. They can more easily take advantage of cloud storage economics –potentially eliminating bulky and expensive tape and VTL assets.For these many good reasons, ELT is `NJAA,’ not just another acronym – it’s an acronym worth getting to know.
In modern analytics, significant value can be gained from insights that are based on multiple data sources. That’s the power of the data lake concept. But for most larger organizations, unaware that there are easy data movement options, data lakes still exist far from the organization’s most important and often largest data collection—the data in mainframe storage.
Whether this data is already at home in a mainframe data warehouse or scattered in multiple databases and information stores, its absence from the data lakes is a tremendous problem.
In fact, in an Information Age article, Data Storage & Data Lakes, editor Nick Ismail noted, “If an organization excludes difficult to access legacy data stores, such as mainframes, as they build the data lake, there is a big missed opportunity.”
Recognizing this growing business challenge, Model9, a company founded by mainframe experts and cloud evangelists, created unique, patented technology that can move mainframe data and transform it to and from standard industry formats and between the cloud and mainframe. Specifically, Model9 eliminates the traditional ETL process, which is expensive in terms of time, money and CPU cycles, and delivers richer outcomes with fewer resources.
In other words, Model9 helps get mainframe data into the game.
Unlike traditional brute force methods of moving data to other platforms, requiring heavy use of mainframe processing power, Model9 does most of the work outside of the mainframe, a process of extract, load, and transform (ELT) rather than extract, transform, and load (ETL). It is fast and economical.
Is it really that easy? Yes. The Model9 architecture includes a zIIP-eligible agent running on z/OS and a management server running in a Docker container on Linux, z/Linux, or zCX. The agent does the job of reading and writing mainframe data from DASD or tape directly to cloud storage over TCP/IP using DFDSS as the underlying data mover. Other standard z/OS data management services are also used by the agent, such as system catalog integration, SMS policy compliance, and RACF authorization controls. Compression and encryption are performed either using zEDC and CryptoExpress cards if available, or using zIIP engines.
Although the world of DB2 tools on mainframe has made a lot of progress in integration with other SQL databases by using CDC technology, this remains an expensive approach and one that does not scale optimally. In contrast, the Model9 Image Copy transformation offering is an industry-first lift & shift solution for DB2 on mainframes.
Additionally, Model9 offers migration capabilities for unstructured mainframe data types such as VSAM, PS, and PO as well as support for COBOL copy books, delivering end-to-end process automation.
Presto, organizations can now easily share mainframe data with analytic tools and incorporate it into data lakes, potentially yielding powerful insights. Model9 offers data lakes a chance to reach their fullest potential and makes mainframe pros into business partners!