July 3rd, 2011 14:00
Thoughts about backup after EMC World 2011
Let me start by saying that this was my second EMC World. My first was in 2010 in Boston and I was quite happy with it. It was actually my first time in the US too. As far as the conference went, it was big and I kept nice memories. At that time I might have been focused more on getting to know people you hear about every day and less on lectures, but it was my first time and I was satisfied. This year, in May, my second EMC World was held in Las Vegas. It really is a city which never sleeps. While I didn't poke my nose out of the Strip area, it was great fun. As far as the whole conference goes, it was bigger and better. Not that something was wrong with the 2010 one, but this one felt more dynamic, with more people and - at least to me - more exciting news and new knowledge. I was blown away by Isilon. I will be the first to admit that I didn't pay much attention to storage before, but my feeling is that many others had also missed this name so far (at least people in Europe I have spoken to, but you could see that at the conference too). I also had hands-on time with RecoverPoint, Data Domain Archiver and VPLEX Geo, learned a bit more about Greenplum, attended two BoF sessions (NetWorker and Symmetrix), got to know loads of new people and saw some old friends, had a few beers and loads of fun. Keynotes were great, with the one by Pat Gelsinger being my favorite. Oh, I finally saw Blue Man Group too. Great show - worth every cent! NetWorker, which tends to be my primary focus, was represented in sessions on DD Archiver and the new NMM (and the BoF session I mentioned earlier). And while all segments of EMC World were exciting (and more exciting than the previous year), this feeling was left out for the BURA section. So I wish to write down my reasoning and perhaps have some other people on board join in with their thoughts. Obviously this is going to be a technology-wise talk.
In 2010, EMC World brought us cloud. Cloud at that time was already a hot topic. Back then many of us were still unsure whether this was a real thing or just wishful thinking, but things are clearer now. Nevertheless, one thing that I mentioned back then was that cloud itself generates big data, which poses a real challenge for backup. I noticed that in the snapshot arena EMC has integration with NetWorker (being the backup flagship), but so much could be done better and I simply do not see much investment in that area. The second thing, more worrying for the future of the cloud, was the fact that you really need some fast infrastructure there and the market offering at the time was a bit away from that. Then this year we saw Brocade (just before EMC World) coming with 16Gbps FC, the SSD boom is happening, 40Gbps Ethernet is coming... so things are falling into place... all the pieces of the puzzle to build new data infrastructure are becoming available. Not only that. Years ago, when working with VMware for the first time, I said this would move to hardware soon, as hardware is always more efficient. You will have a rack, you load a hot-swappable drawer with a disk inside, upon first run you answer a few simple questions like how many IPs and where, names, gateways, DNS, how much memory and CPU, and it is there. Information about the OS itself (baseline) is contained within the big array firmware and copied over to the disk where this "virtual" machine is built. Later on, you can remove it and place it in another appliance or simply build a mirror cluster over some fast lines. It is not hard to imagine such a future at all, with a few things different of course. And this year, upon closing of EMC World, EMC made an announcement of something in the works (I can't remember that being a practice so far) - Project Lightning. It is quite different from what I thought a few years ago, but at the end of the day it is the same direction.
This year's EMC World is seen as a continuation of the cloud story with an accent on Big Data. The reason for this is the analysts' expectation that we would hit a significant increase in information volume, which is happening as we speak. EMC showed they have an answer for the big data challenge in the areas of storage (VMAX and Isilon) and analytics (Greenplum). Make it cloud ready via VPLEX Geo (I keep assuming two locations for data centers even though that might not be your case). Things go smoothly with virtualization as there is VMware. So, things look good there. What about backup?
I happen to have big data and I happen to manage it with snapshots (pretty much like everyone else). I'm lucky to have the same vendor behind the backup application (NetWorker) and the disk array (Symmetrix). Running business-critical databases (Oracle) and applications (SAP) with data hosted on Symmetrix, we need to provide 24h protection and disaster recovery. Due to the size, snapshots are a great solution for us. Because we use one solution which lets us integrate with the database/application (online modules for Oracle and SAP) and the disk arrays (PowerSnap), we built a system where we have daily protection at a level which is acceptable. As nice as it sounds, there are some quirks in the way PowerSnap is designed, such that running this with scripts or with a faster network (depending on your design of course) can make it more efficient. This is definitely not something you want when you have big data. Further, in the old days PowerSnap used to be available for different disk array vendors. Not these days. While I could perhaps justify that by market conditions for some of them, I can also say that things in Gartner's Magic Quadrant do not change that often. Having NetWorker support for snapshots on those platforms (e.g. a PowerSnap module for them) would be a great benefit. But that never happened, nor did I see any enthusiasm on the EMC side to do so. Pity, as I expect an enterprise backup solution to cover the wide range of solutions found in enterprise environments. Being able to do so would help EMC gain new markets as well.
Earlier this year, somewhere in January, Data Domain Archiver was presented. It is a nice idea and I think it may have a future. There were a few things I didn't like in the way mtrees are handled, but this is known and engineering already had it in mind to change that in future releases. What surprised me, though, is the fact that such a solution, which would primarily be used to archive data, didn't use spin-down technology. There are markets where the cost of energy is taken more seriously than in others, and having those disks spinning is not something everyone is thrilled about. Not sure why this was missed in the first release, but hopefully it will change. Further, we still have an ongoing battle between techies and management over what is backup and what is archive, and usually they settle on having backup, long-term backup and archives. In my experience, based on the customer base I worked with, Data Domain is a perfect fit for so-called long-term backup. But in all cases where archives are done with backup software (or archiving software), there has been a request to remove those from the data center premises. If you need to move your archive storage out, will this appliance help? I think it would be nice to come up with a concept similar to tapes, where you have stackable disks which you can take out as you would tapes and keep wherever you want. If you do not wish to do so, you just keep them inside the appliance. The appliance should keep metadata so that any time you bring a disk back, in any position, the data is accessible. I was hoping to see something more like that. In that case you would eliminate tape from the equation for good.
There are a few more things bothering me about Data Domain. I miss the ability to have a pool of disks I can use to build a device without dedupe and/or compression. Some data simply does not suit those. What I see at the moment is everyone using compression or dedupe, and when you combine this with backup you get strange things. If the database is using compression, then you do not get good results on DD. If the storage is using compression or dedupe, then you put load on it during backup. At the moment I see more benefits on the application side than on primary storage, but in the future, with CPU advances, that is likely to change (of course, we might see application appliances like SAP HANA kicking in to make the whole picture more complicated). Archive logs won't dedupe well, so DD storage might be too expensive for them. Bottom line is, DD is not the answer to everything, but with a bit of engineering one can make it so (for example, by adding the ability to mix VTL with DD Boost devices).
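To illustrate why data that is already compressed at the source gains little on a compressing target, here is a minimal sketch. It uses Python's standard zlib as a stand-in for appliance-side compression; that is purely my assumption for illustration and says nothing about how Data Domain actually compresses or dedupes. The sample "database rows" and the helper name compression_ratio are made up.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# Hypothetical sample data: text-like rows built from a small vocabulary.
rng = random.Random(0)
vocabulary = [b"customer", b"invoice", b"order", b"payment",
              b"status", b"open", b"closed", b"pending"]
raw = b" ".join(rng.choice(vocabulary) for _ in range(50_000))

# Simulate a database that compresses its data before backup sees it.
already_compressed = zlib.compress(raw)

first_pass = compression_ratio(raw)                  # uncompressed source
second_pass = compression_ratio(already_compressed)  # pre-compressed source

print(f"uncompressed source:   {first_pass:.1f}x")
print(f"pre-compressed source: {second_pass:.2f}x")
```

The second ratio should land close to 1, meaning the target does the work but saves almost no space. Real appliances differ in the details, so treat this only as a picture of the general effect.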
Another thing I was expecting from EMC as a hardware company was to build a backup appliance pretty much as NetApp has - with SnapMirror and SnapVault. If that single appliance could be used with the Symmetrix and VNX lines in such a manner, that would be great. That way you could leave PowerSnap out of the equation and have an efficient solution. The next step would be to make sure such an appliance works with other storage solutions too. I do remember one EMC techie a few years ago talking about the SMI protocol and the idea of having a single standard for snapshots, so that one module can integrate with whatever snapshot and storage solution the customer uses. I was a bit skeptical, as I remember how painful the NDMP protocol was to build (and despite it vendors kept doing their own little things). It is a nice idea, not sure how feasible though. It was nice to see the EU pushing a one-charger solution for mobile phones, but I have no idea if some authority could push storage vendors towards a unified protocol to make backups easier.
One thing which worries me, and keeps worrying me, especially when it comes to big data, is the number of sessions. Storage-wise, big data is big volume. Backup-wise, it is a big number of sessions or a lot of data to be indexed. If you have a single backup solution, let's say NetWorker, metadata for snapshots might be big. If using the traditional backup form, directly or via proxy for snapshots, expect your indices to grow. With the rise of virtualization and the never-ending race to minimize costs, I see more and more customers building application hotels which sometimes host 50+ databases, for example. This takes its toll in the backup world. While a few years ago you could still see cold backups for databases, today they are almost extinct. Imagine you have 100 databases with archive log backups running every hour. Let's assume each run generates 10 sessions. That would be 1000 sessions per hour, or about 16.7 sessions per minute. This is a simplified example. Add to those file system backups, database backups, some NDMP and then some ad hoc stuff plus daily restores, and suddenly you get a server which is busy 24/7. Running NMC in such an environment is a challenge, as real-time monitoring is not easy under such conditions (actually, in the environment I work in I had to script NMC's monitoring capabilities to have them on demand, as NMC just adds to the load on the server). Add to this equation products supporting snapshots, like PowerSnap and NMM, where snapshot cleanup has to be initiated from the client, and you get all sorts of wonderful things (e.g. a deletion clashing on the server with a server-initiated nsrmmdbd operation during relabeling, which will break either the relabeling or the snapshot deletion). Bottom line is, NW code and database need some refresh to cope with modern ways of doing backups. It is not an easy task and I certainly do not wish to be in the shoes of the PM or engineers on this one, but there is much to be done to make things better.
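The session arithmetic above can be sketched in a few lines so you can plug in your own numbers. The function and parameter names are hypothetical, and the model is as simplified as the example in the text: it counts only archive log sessions and ignores file system, NDMP, ad hoc and restore sessions.

```python
def session_rate(databases: int, sessions_per_run: int, runs_per_hour: float = 1.0):
    """Return (sessions per hour, sessions per minute) for archive log backups."""
    per_hour = databases * sessions_per_run * runs_per_hour
    return per_hour, per_hour / 60


# The example from the text: 100 databases, 10 sessions per run, one run per hour.
per_hour, per_minute = session_rate(databases=100, sessions_per_run=10)
print(f"{per_hour:.0f} sessions/hour, {per_minute:.1f} sessions/minute")
```

Doubling the log backup frequency or the number of databases doubles the load, which is why application hotels hit the backup server much harder than their storage footprint alone would suggest.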
So what did EMC World 2011 bring to the table in terms of backup? Nothing really. I would say the only highlight was RecoverPoint. The rest, including the keynote, was below the expected level and failed to impress the way the rest of the EMC offering did (my personal impression). While I didn't get a chance to visit vLab, I was told there was nothing for NetWorker there either. It is exactly this approach which keeps people wondering about the future of backup within EMC. While EMC people like to state that they have the best of all solutions so you can choose, customers on the other hand find this offering a bit confusing, as each solution has some limitation the others do not. Yes, they can be integrated at a certain level, but then you are integrating two backup solutions, which from a cost perspective is not attractive to many.
This was the second year we have been told that BURA is the leading department within EMC. With that in mind, I would expect more BURA news to be presented at EMC World, but for some reason this year that was not the case. I hope next year we will see more and that some of my concerns will be addressed. When talking about big data, I expect a big data backup solution to be presented as well. I hope we will see more exciting stuff, unified "EMC backupware" and more things on the roadmap. I remember that during the BoF session a few questions got answered with "we discuss this, but it is not on roadmap". It felt as if the catching-up game continues, and this should not happen. It was also a bit strange to see a NetWorker gathering held shortly after EMC World (announced here) where details of things to come and the roadmap were discussed, while this was left out of the BoF session, for example. I also hope to see the EMC backup solution integrate more with storage solutions from other vendors - in this area some other backup vendors have already taken some steps.
After this year at EMC World, I have no doubt that storage and backup are two sides of the same coin. But the backup side is also much more, as there are legacy systems, different storage options, etc. The storage offering was great this year and now I look forward to seeing backup catch up.

