Thanks for your help, guys. To try & answer your questions:
Kelleg: 1. Yes, all 8 backups on RAID Group 3 start at the same time (00:10 early Saturday morning), along with the backups of RG2 & RG6. The destinations are three 4+1 RAID-3 groups (each 1 LUN), each comprising 5 x 750GB SATA-II disks. Four of the 8 RG3 LUNs (and two small Windows servers) back up to RG10, and the other four (plus 7 small Windows servers) back up to RG12.
2. The one RG2 LUN and the two RG6 LUNs (also one of the Exchange databases and 21 small Windows servers) back up to RG13. But the big Exchange backups (400GB each mailserver) don't conflict with the RG2/3/6 backups - they back up to disk 19:00-21:00ish daily, and duplicate to tape 21:00-23:00ish (excellent performance). And the small Windows servers (application servers, domain controllers, etc.) back up at quite good speeds even while the SAN disks are chugging along slowly.
3. Hmm, I hadn't seen those recommended limits before. A typical RG3 disk (SATA-II) shows Total Throughput of 90-110 IOPS throughout the backup period (from the start of the 8 jobs till the end of the last (biggest) one 41 hrs later). The RG2 disk (SATA-II) averages about 90 IOPS and the RG6 disk (FC2) averages about 80 IOPS. For the destination disks, RG10 averages about 10 IOPS, RG12 averages about 50 IOPS (range 20-90) and RG13 averages about 10 IOPS. All this is mostly Read IO. For the backup LUNs (RGs 10/12/13) it looks as though the peak values may coincide with the duplications to tape, but I need to investigate this more. Looking at Total Bandwidth MB/s, RG2/3/6 show values of 4-6, 2-5 & 3-5 MB/s (mostly Read) and RG10/12/13 show values of 2-4, 6-16 & 2-4 MB/s (mostly Write).
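As a rough cross-check on those numbers, average I/O size is just bandwidth divided by IOPS. Here is a minimal Python sketch, assuming the MB/s and IOPS figures refer to the same object over the same period and that the low and high ends of each range can be paired up (both are my assumptions, and the values are eyeballed from the graphs, so treat the result as order-of-magnitude only). The function name is just for illustration.

```python
def avg_io_size_kb(bandwidth_mb_s: float, iops: float) -> float:
    """Estimate average I/O size in KB from bandwidth (MB/s) and I/O rate (IOPS)."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return bandwidth_mb_s * 1024 / iops


# Figures quoted above for a typical RG3 source disk (mostly Read).
# Pairing 2 MB/s with 90 IOPS and 5 MB/s with 110 IOPS is an assumption.
low = avg_io_size_kb(2, 90)
high = avg_io_size_kb(5, 110)
print(f"RG3 average I/O size: roughly {low:.0f} to {high:.0f} KB")
```

If that holds, the source disks are doing I/Os of a few tens of KB each, which would fit with a workload made up of lots of small files rather than large sequential reads.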
4. Dirty Pages %: sorry, I can't see that parameter anywhere, even when I select SPs via the SP tab, or look at them via the LUN tab. Maybe I omitted to tick some parameters when I enabled & started logging?
Jps00: Good point about the buses - all the RGs we are talking about here, except RG6, are on Bus 0. How can I check Bus utilisation? I can't find this parameter - or should I look at something else? Please see 1 above for the RG3 setup; yes, the 8 RG3 backup jobs start together (also a dozen or so small Win server jobs).
The NetBackup setup is Enterprise Version 6.5.4 with 1 Master/media server, plus 1 extra media-server, running Win 2003 Server R2, 64-bit, SP2. Both are Dell PE2950-III 64-bit, single-processor (Xeon 2330MHz) Quad-core with 8GB RAM, SAN-attached. RG10, which receives half the RG3 backups, is owned by the extra media-server; the other backup LUNs are owned by the Master/media server. What exactly did you want to know about the Catalog setup? I upped the NetBackup Client Communication buffer sizes to 512KB & server NET_BUFFER_SIZE to 513KB; the server NUMBER_DATA_BUFFER is 64 and SIZE_DATA_BUFFERS is 524288. I don't think NetBackup itself is the bottleneck, as performance for Exchange full backups is terrific - every day, two 400GB mailbox servers are staged to disk (2 hrs in parallel) and the images copied to tape at two locations, each with a pair of fibre-attached LTO4 tape drives on the SAN (another 2 hours).
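To put a number on "terrific", here is a quick back-of-envelope calculation in Python. It assumes each 400GB mailbox server takes the full 2 hours to stage to disk and that 1GB = 1024MB; the function name is only illustrative.

```python
def throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s for size_gb transferred in the given hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return size_gb * 1024 / (hours * 3600)


# Two 400GB Exchange mailbox servers staged to disk in parallel over ~2 hours.
per_server = throughput_mb_s(400, 2)
combined = 2 * per_server
print(f"Exchange staging: about {per_server:.0f} MB/s per server, "
      f"{combined:.0f} MB/s combined")
```

That is an order of magnitude above the single-digit MB/s seen on the file-share RGs, on the same media servers and buffer settings, which is why I doubt NetBackup itself is the limit.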
Thanks for suggesting the "Best Practices - 28.5" document; I'll read it carefully. As this is more concerned with CX4s, I had just used the older "BP-26" document for our CX3.
Additional information: the RG3 LUNs total 6.8 million files, the RG2 LUN is 3.6 million, and the two RG6 (FC) LUNs add up to 2.8 million files. Users...!! (sigh!) Small file sizes could well be a major factor, but we don't back up Windows profiles, so no cookies or Temporary Internet Files.
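For a feel of what the file counts mean, here is a small Python sketch of the average file rate for the RG3 jobs, using the 6.8 million files and the 41-hour window mentioned earlier. Splitting the rate evenly across the 8 jobs is purely my assumption for illustration (the jobs differ in size and finish at different times), and I have not worked out the RG2/RG6 rates because I did not quote their durations.

```python
def files_per_second(file_count: int, hours: float) -> float:
    """Average number of files processed per second over the backup window."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return file_count / (hours * 3600)


# RG3: 6.8 million files, 41 hours from the start of the 8 jobs to the end of the last.
rg3_rate = files_per_second(6_800_000, 41)
per_job = rg3_rate / 8  # assumption: files spread evenly over the 8 jobs
print(f"RG3: about {rg3_rate:.0f} files/s overall, {per_job:.1f} files/s per job")
```

A few files per second per job suggests the time is going on per-file overhead (opening, reading metadata, seeking) rather than on moving data.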
It has been suggested that NetBackup SAN Client licences might speed up backups - the RG2/3/6 LUNs are shares on 4 clustered Dell NX1950 64-bit machines (dual quad-core Xeon 2330MHz, 8GB RAM) running WUDSS 64-bit. But I've read that this may be true for slow backups of big files, not necessarily for lots of small files.