
September 22nd, 2014 00:00

Data Domain VTL - multipathing, path failover support with IBM's atape/IBMtape device driver?

Hi,

I wonder whether anyone out there has some experience in this:

In a project, we plan to deploy Data Domain in VTL mode (FC). The customer uses TSM as backup software (AIX backup server; AIX/x86 Linux/Windows LAN-free clients), and a second environment to be served by the DD VTL is IBM System i (AS/400). An old EMC disk library (CDL/EDL 4xx6 series) is currently in use; it is EOL/EOSL and is to be replaced with a DD. The customer also makes extensive use of high-end IBM tape libraries (Jaguar drives in TS3500/3584), and IBM's path failover feature (built into the IBM tape device driver, atape/IBMtape) is also in use (DPF, data path failover; CPF, control path failover).

The direct competitor to the proposed DD VTL is a VTL solution from IBM. The customer prefers a VTL solution with at least some high-availability capabilities (even if the appliance, the "engine", is a single unit): e.g. path failover with the IBM tape device driver.

Does anyone have experience with this? (The only information I found in EMC documentation is that DPF is not supported and must be disabled when configuring a DD VTL on the backup server.) Is any implementation of this IBM path failover, or anything similar (automated path failover with a DD VTL), supported or at least technically viable?

Thanks in advance.

Geza


September 25th, 2014 08:00

Hello Geza,

Indeed, DPF is currently not supported for TSM with a DD VTL and will not work. Maybe there will be a solution for this in a future release of DDOS. Would an NFS-based implementation be a possible workaround for you? I know this is not always what the customer wants, as they have an existing FC infrastructure.

For IBM i, multipathing for tape devices is not supported by the operating system.

Best regards

Peter


September 29th, 2014 07:00

Dear Peter,

thanks for your answer. NFS does not really seem to be an option: the customer currently uses an old SAN-based VTL from EMC (EDL/CDL 4x06), wants to stay with an FC-based approach, wants a fully SAN-based solution for its open-systems and IBM i environments, etc. So there really are many arguments against an NFS solution.

Balázs, Géza

Mobile: (+36) 70 6024536



October 14th, 2014 05:00

Let me know if you need help implementing Data Domain for IBM i.


September 28th, 2015 05:00

Hello Gents,

Could you please tell me whether there has been any change on this matter? We have to implement a DD VTL (DDOS 5.5.0.9-471508) emulating an IBM TS3500, with TSM 7.1 on AIX 7.1, and multipathing is required.

Thank you in advance!


Georgi


October 1st, 2015 00:00

For me the question would really be... why do you need path failover for a VTL tape drive?

Path failover is important for physical tape drives because they are expensive to buy, so you generally have fewer of them, and they therefore have a greater impact when unreachable because a path failed.

You do not have that restriction with a VTL, because if you want another tape drive, you create it for free...

If you have a host with 2x FC initiator ports and, say, 2x DD target ports, you should create a couple of drives on each path (likely via 2 different SANs), and then, if you lose an initiator, a target, or an FC switch, you still have 2 drives to use.

If your resilience requirements call for more drives in the failed situation, just create 4 drives on each path because... they're free...
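The drive-spread argument above can be sketched as a toy model. All component names here (hba0, san1, dd_5a and so on) are purely illustrative assumptions, not taken from the thread: each virtual drive is tied to exactly one (initiator, fabric, target) path, and losing any single component still leaves the drives on the other path usable.

```python
# Toy model of the drive-spread idea: each virtual drive is reachable
# over exactly one path (host initiator, SAN fabric, DD target port).
# All component names are illustrative.
paths = {
    "vdrive0": ("hba0", "san1", "dd_5a"),
    "vdrive1": ("hba0", "san1", "dd_5a"),
    "vdrive2": ("hba1", "san2", "dd_4b"),
    "vdrive3": ("hba1", "san2", "dd_4b"),
}

def surviving_drives(failed_component):
    """Drives still usable after a single component failure."""
    return sorted(d for d, p in paths.items() if failed_component not in p)

# Losing one fabric, one initiator, or one DD port always leaves
# the two drives on the other path available.
print(surviving_drives("san1"))   # drives on san2 remain
print(surviving_drives("hba1"))   # drives on hba0 remain
```

With twice as many drives per path, the same single failure leaves twice as many survivors, which is the point: on a VTL the "spare" drives cost nothing.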

DPF/CPF is IMHO of very little relevance to virtual tape drives in any VTL-type system (EDL included), because of the fundamental difference: your investment is not in physical tape drives but in a DD system that can create many, many robots and tape drives.

Specifying more FC HBAs/SLICs in the DD will give you more options when creating/assigning virtual tape drives.

I doubt we will ever bother with CPF/DPF on DDRs for that reason.

Now... failover of a virtual robot... that's a whole new topic ;¬)

Regards,

Jonathan


October 1st, 2015 07:00

Hi,

after this project was done (I mean, the implementation phase), the customer raised the question about multipathing about a month ago. We forwarded this question to EMC, opened an SR, and the quite straightforward answer was that it is NOT SUPPORTED with TSM.

We were sent this knowledgebase article:

Article Number: 000181885

https://emc--c.na5.visual.force.com/apex/KB_BreakFix_1?id=kA1700000000tSd

Notes

TSM Multipathing with DD VTL not supported

SYMPTOMS

· Very slow backups

· Many scsi resets in kern.info

APPLIES TO

· All Data Domain systems licensed for VTL

· All Software Releases

· VTL Protocol

· Tivoli Storage Manager (TSM) 5.5.3, 5.4.x, 5.5.x, 6.1.x

PURPOSE

Explain how to reconfigure a Data Domain system to prevent problems due to lack of multipathing support in TSM.

CAUSE

TSM does not support multipathing with Data Domain systems.

SOLUTION

· TSM only allows one Fibre Channel path to a device. If it sees the same device serial number over two FC paths, it rejects that device as invalid.

Example: with one TSM library defined, the changer for the library is given access in two separate access groups, and it is being accessed over VTL port 5a in the first access group and over port 4b in the second. This is an invalid configuration and must be changed for TSM to access the Data Domain system. There are two ways to rectify this: either remove the changer from one access group, or make both access groups use the same VTL port for the changer.

· Another cause of problems is that TSM does not allow a secondary path to a Fibre Channel device, so you cannot define secondary paths in the access groups for any of the tape drives or the changer. The proper configuration in the access group is to set the secondary ports on all devices (tape drives and changer) to "none". That is the only proper configuration for TSM.

Note: Example of an incorrect TSM configuration within a VTL Access Group where both groups are accessed by the same TSM Server on different initiators:

Group: ag_tsm1

Initiator Alias    Initiator WWPN
tsm1               10:00:00:00:c9:6b:c5:46

Device Name    LUN    Primary Ports    Secondary Ports    In-use Ports
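Conversely, a valid single-path TSM definition, following the KB guidance above, might look like the macro below. The server, library, drive, and device special-file names are illustrative assumptions, not taken from the thread:

```
/* Hypothetical TSM macro: exactly one FC path per device, no secondary paths */
DEFINE LIBRARY ddvtl LIBTYPE=SCSI
DEFINE PATH tsmsrv1 ddvtl SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=/dev/smc0
DEFINE DRIVE ddvtl vdrive0
DEFINE PATH tsmsrv1 vdrive0 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=ddvtl DEVICE=/dev/rmt0
```

On the DD side, the access group would then expose each device on a single primary port with the secondary port set to "none", so that each AIX special file corresponds to exactly one FC path.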



May 4th, 2023 22:00

That's a very naive and narrow-minded response.

Both in our physical ATL (TS3500 / 3592 drives) and in our virtual tape library (ProtecTier vTS3500 / vLTO-3) we had some form of non-disruptive data path and control path failover, both for Windows and AIX operating systems, across two independent SAN fabrics.

As it stands now, with the Data Domain VTL (vTS3500 / vLTOx), the defined drives may have multiple data and control paths to the host, but without path failover support they are all PRImary paths, whereas the extra paths should be settable as ALTernate paths in the OS for non-disruptive automated failover.

We are a TSM/Spectrum Protect customer on AIX with large 24x7 enterprise needs. Non-disruptive failover is critical to our daily business and nightly batch.

Think of scenario #1: planned maintenance on the SAN fabric #1 switch.
All drives on SAN#1 paths could be taken offline ahead of time, the changer control path on SAN#1 could be reconfigured via SAN#2, and operations could continue. If, however, the drives are in use 24x7, taking drives offline at the application level would require waiting for mounted tapes to be dismounted, and changing the control path to the changer would require no mounts to be in progress.

Scenario #2: the customer is in the middle of a really large DB / Exchange mailstore recovery that may run for 24+ hours.
Suddenly the application initiator port (HBA) on SAN#1 goes offline. All LTO drives with virtual tapes mounted would then fail their recovery tasks. The recovery operation may need to be started from the beginning again. Each virtual tape will now be logically stuck, as it was mounted in a drive that can no longer be contacted by the host, and it cannot be ejected. The host control path will also be down, as only a single control path can be defined to TSM, and normally SAN#1 Tape#1 would be defined as the control path by default. A storage administrator would need to reconfigure the backup application's control path to the library, as no tapes can be mounted to any drives while the single control path is down. A Data Domain administrator would need to manually eject the stuck LTO tapes back into VTL slots, ready to be mounted into drives that are still online, before the restore process can be restarted.

I could write up a dozen more unacceptable scenarios, irrespective of how many tape drives are defined in the VTL. Once a host initiator goes down in a fabric, all drives within that access group for that host initiator are offline, along with, potentially, the primary control path.
The only thing that works really well is NPIV: with NPIV enabled, endpoints can fail over between two system addresses (within the same SAN fabric), so the access groups / SAN zoning do not become invalid upon a DD port/HBA failure, and hence the PRImary path to the server OS remains online and uninterrupted. But there is still no ability to keep a drive accessible across two SAN fabrics, or the control path accessible across SAN fabrics, meaning that if a SAN switch is down for maintenance, or even a host initiator HBA port is offline, there is no way to keep business operations running uninterrupted without administrator intervention.

Software support for automated datapath and control path failover in Atape device driver is what we need.

May 8th, 2023 02:00

You are aware you are responding to a post from 9 years ago, with the last response from 8 years ago?

You might have been "pampered" a bit by an IBM implementation of tape drive failover (which, of course, is something that definitely might help improve stability in a backup landscape, but it is by no means a standardized feature in the backup world). If memory serves me right, various other data protection software packages and/or suppliers do not even offer such a feature for automatically handling tape device failover.

https://www.dell.com/support/kbdoc/en-us/000042776?lang=en

"Although it may be necessary to restart backups and restores, some applications are able to continue backups during port failover/failback.  Tivoli Storage Manager (TSM) and NetBackup have been observed to do this on most operating systems."

That is stated in the somewhat older Dell KB article above. To be honest, I have not been following how things actually are nowadays, as we completely let go of physical and virtual tapes years ago in favor of using a DD with DD Boost only, over the LAN/WAN. So I might be a bit out of touch with current options and features in backup products with regard to tape drive failover.

When we still used DD VTL, we also spread the tape drives assigned to one system over two DD endpoints. However, the clients themselves only had one backup SAN connection by design, something that was carried over from the times we used physical tape drives; it was just one single backup fabric, not a dual-SAN setup like was (and is) common with storage. So if one DD port went down, you would still have access to half of the drives, while if the single port on the client side was affected, it would not have access to any tape drives whatsoever. Again, all by design, as backup was not regarded as critical enough to require dual connections on the client end and a dual-fabric approach for the backup SAN.

"Software support for automated datapath and control path failover in Atape device driver is what we need."

If that is your actual summing-up of the matter at hand, then that is not really up to Dell, is it? Or am I misinterpreting that?
