Mark Jobbins, Vice President of Technical Services, Asia Pacific & Japan, Pure Storage
The backup industry has been around for as long as we’ve had computers. As the world accumulates more and more data, and as organizations recognize the power of that data to deliver key services, the need to better protect it has grown, especially for large organizations. There are of course many ways to skin this cat, and a multitude of methods exist today to help companies back up and restore their data.
One architecture that many enterprises have adopted is disk-to-disk-to-tape (D2D2T). The first disk in this scenario runs production workloads, while the second disk acts as the purpose-built backup appliance (PBBA). For further protection, enterprises often deploy offsite facilities to which the same disk-based PBBAs are replicated. Older data is then stored and locked away on tape for long-term retention. This is the D2D2T architecture.
D2D2T works great for backing up data. Backups are fast, and deduplication makes efficient use of storage. But there are three core problems with this architecture.
The first arises when it’s time to restore data. This blog post by an independent consulting firm is indicative of the frustrations faced by customers trying to restore from disk. They estimated that a 6TB database would take 25 days to fully restore, but this is only an estimate: they killed the job after 24 hours, when only 2 percent of the restore had completed! When IT misses SLAs, it puts the business at risk.
The second problem is the tremendous waste built into this architecture. If there are multiple backup systems in the production site, then there need to be multiple backup systems in the offsite data center as well.
The third problem is that once data goes on tape, data goes dark. The data is out of reach and offers no value. If data is truly the most important asset in an enterprise, this approach runs completely counter to that idea.
The Future of Backup and Data Protection
Pure Storage shook up the storage industry almost a decade ago by bringing flash technology to enterprise data storage environments and setting new ROI and performance benchmarks for data centers. But flash was always viewed as being too expensive for backup and recovery. In the last few years, however, the basic economics of flash have changed, and they continue to shift with new NAND technologies. Coupled with enterprise datasets growing from terabytes to petabytes and beyond, and the need to protect this data more efficiently, this has swung the balance towards the use of flash for data protection.
The second major factor causing enterprises to rethink their data protection strategies is the increasing demand to access their data, not just to stay competitive but also to meet increasingly stringent laws on data reuse.
We are undoubtedly in an era where data creates value and tomorrow’s industry leaders will build practices around innovating on data. Those who do not will naturally fall behind.
AI is the ultimate workload that is fueled by data, massive amounts of data. More data is one of the only ways to improve the accuracy of AI models. One customer, in pursuit of curing cancer with AI-powered pathology, is using decades-old imaging data to train their models. Decades’ worth of cell images, all now being used to fight cancer.
On the legislative front, policies such as GDPR in Europe are placing tremendous pressure on enterprises to be able to search through all their data. In today’s data-first world, more and more is being asked of backup data.
Why Flash is Replacing Disk for Backup and Recovery
Flash offers orders-of-magnitude performance increases over spinning magnetic disk. High-performance flash backups and restores can match the speed of all-flash production systems, typically restoring as fast as the production systems can consume the data. Flash backups can also enable more simultaneous server backups, providing better utilization at scale. By coupling flash with data reduction, we get better economics and great restore performance. If you are suffering from missed backup windows or restore SLAs, the solution is flash-to-flash.
Pure Storage FlashBlade, for example, delivers peak backup performance of 90 TB/hr and restore performance that is 3x higher than disk at 75 GB/s, and this is in just 20 rack units.
FlashArray snapshots are not only held locally on the FlashArray for almost instant recovery, they can also be redirected to other destinations such as FlashBlade. This is called a Portable Snapshot, and it can be used for rapid recovery to multiple destinations, providing both data protection and portability.
Rapid Restore with FlashBlade
After discovering how and why our customers were using FlashBlade for data protection, we created Rapid Restore solutions to support all of the key databases, as well as solutions with both the traditional and newer data protection vendors. A single FlashBlade can support a backup rate of 15TB/hr and a restore rate of almost 50TB/hr. With typical data reduction rates of 3:1 on an Oracle RMAN backup to FlashBlade, DBAs can complete their database restores in minutes or hours instead of days.
Great as this is, however, a piece of the puzzle was still missing.
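To put the restore rate in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes an idealized restore that sustains the quoted rate end to end, which real restores rarely do (the network, the backup software and the database host all add overhead). The function name restore_hours is purely illustrative, and the 6TB figure is borrowed from the disk restore example earlier in this post.

```python
def restore_hours(dataset_tb: float, rate_tb_per_hr: float) -> float:
    """Ideal restore time in hours, assuming the rate is sustained throughout."""
    if rate_tb_per_hr <= 0:
        raise ValueError("restore rate must be positive")
    return dataset_tb / rate_tb_per_hr


# Assumption: a 6 TB database restored at the ~50 TB/hr peak rate quoted above.
hours = restore_hours(dataset_tb=6, rate_tb_per_hr=50)
print(f"Ideal restore time: {hours * 60:.1f} minutes")
```

Even if a real-world restore ran several times slower than this ideal, it would still finish in well under a day.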
Flash-to-Flash-to-Cloud for a Complete Modern Solution
Even though our customers were effectively using FlashBlade to deliver Rapid Restore capabilities, they still needed to store large amounts of data offsite for retention and compliance - which they had continued to do with tape. As previously mentioned, tape is complex and slow, but the real failure of tape is that your data is locked offline somewhere – providing no “real time” value for your company.
The answer is to finally replace tape with low-cost cloud object storage, like Amazon S3.
With Pure Storage ObjectEngine, we can finally offer a modern solution for backup, Flash-to-Flash-to-Cloud, and with it the final piece of the puzzle.
The ObjectEngine platform enables inline data deduplication and encryption to the cloud whilst also reducing storage and data transmission costs by up to 97%. Coupled with a FlashBlade scale-out storage system, recovery time objectives (RTOs) can be further reduced, enabling rapid recovery of data in the event of a disaster.
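For readers unfamiliar with deduplication, the toy Python sketch below shows the general idea: split the data into chunks, fingerprint each chunk, and store each unique chunk only once. This is not ObjectEngine's actual algorithm. The fixed 4 KB chunk size, the use of SHA-256 and the function names are all assumptions made for illustration, and encryption is left out entirely.

```python
import hashlib


def dedup_chunks(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep each unique chunk once."""
    store = {}   # fingerprint -> chunk bytes (unique chunks only)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored chunks."""
    return b"".join(store[d] for d in recipe)


# Hypothetical backup stream with heavy repetition: ten chunks, two distinct.
backup = b"A" * 4096 * 9 + b"B" * 4096
store, recipe = dedup_chunks(backup)
stored = sum(len(c) for c in store.values())
print(f"logical bytes: {len(backup)}, stored bytes: {stored}")
assert rebuild(store, recipe) == backup
```

The more repetitive the backup stream, the fewer unique chunks need to be stored or sent over the wire, which is where savings in storage and transmission costs come from.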
Warm Up Your Cold Data
If you can get your backup data to the cloud, then you can start to think about how you re-use your data for migration, dev/test, analytics, etc. Since ObjectEngine Cloud is built to run in the cloud, you can leverage the cloud edition of your backup software to recover wholly within the cloud. In the future, you will also be able to restore your data to Pure’s Cloud Block Store or to your favorite Amazon data service. This turns what used to be cold data sitting on a tape inside a vault someplace into business value, leveraging a wide variety of web services.