Running Microsoft SQL Server 2000 on Dell SAN
By Todd Muirhead (Issue 3 2001)
Dell PowerEdge servers, PowerVault storage, and the storage area network (SAN) software that ties them together provide a cost-effective, rack-optimized platform for Microsoft SQL Server 2000. This article describes the setup and configuration of a Dell SAN with Microsoft SQL Server 2000 in a single-server configuration and two-node cluster configuration. It also includes recommendations for SQL Server database implementations on Dell hardware.
Systems running database software demand high performance, high capacity, and fault tolerance. This often results in requirements that can be met only with a storage area network (SAN). As the price of SANs continues to drop and storage requirements continue to increase, more enterprises are turning to Fibre Channel SANs.
Dell® SAN solutions enable customers to maximize the available storage and improve database availability to end-users. Dell combines all of the necessary software and hardware for complete system testing to ensure compatibility and easy setup and configuration.
SANs offer high data integrity and availability
A SAN storage solution allows the storage to grow with the database requirements. A key advantage of a SAN is that both storage capacity and number of attached servers can increase. Up to 20 servers can be added to a SAN just as systems can be added to an Ethernet TCP/IP network. This enables more servers to benefit from advantages of Fibre Channel storage.
Dell's SAN solutions use high-speed Fibre Channel networks and switches to connect servers with Dell® PowerVault® Fibre Channel storage. Current Fibre Channel technology uses dedicated full-duplex 100 MB/sec connections. Dell's SAN 4.05 now supports Intel® -based servers from other vendors such as IBM, Compaq, and Hewlett-Packard.
Test configurations for Dell SAN with SQL Server 2000
Dell chose two configurations to test Microsoft® SQL Server 2000 with SANs. Industry best practices were used to design and set up these environments for performance and availability. An entry-level configuration and a high-end two-node cluster configuration were both loaded with a sample Online Transaction Processing (OLTP) database.
A load simulation tool was used to simulate usage and drive the server to approximately 30 percent CPU utilization. Several backup and restore scenarios were tested to measure data throughput and time required for each scenario. This article provides an analysis of the results and recommendations.
Single-server, entry-level hardware
The entry-level Dell SAN is designed for growth. The storage is a single PowerVault 660F with 14 one-inch 18 GB drives. It also includes a PowerVault 130T SCSI Tape Library attached to the SAN via the PowerVault 35F Fibre Channel-to-SCSI bridge. A PowerEdge® 6450 server and the storage are attached to two PowerVault 51F fiber switches. Figure 1 shows the configuration.
Figure 1. Example SAN and SQL Server configuration
Dell SAN solutions have no single point of failure. The PowerEdge 6450 server has two QLogic® 2200 Host Bus Adapters (HBAs), each connected to a separate PowerVault 51F Fibre Channel switch.
Finally, each Fibre Channel switch is connected to a different storage processor on the PowerVault 660F. The storage processor is the Redundant Array of Independent Disks (RAID) controller for the storage array. Either storage processor in the PowerVault 660F can take full control if one fails. If any SAN component fails, an available path from the server to the storage across the fabric still exists.
Two-node cluster hardware
The server setup differs in a cluster configuration. Two PowerEdge 8450 servers are used to create a Microsoft Windows 2000 active/passive cluster, also known as a failover cluster.
The cluster servers are using Gigabit Ethernet to directly interconnect for cluster-specific communication. Each PowerEdge 8450 server also contains two QLogic 2200 HBAs that connect to each PowerVault 51F switch. The PowerVault 660F has two RAID controllers that each connect to one of the switches.
For additional storage capacity, a PowerVault 224F with fourteen 18 GB drives was added to the PowerVault 660F. The PowerVault 224F is connected with two fiber cables to maintain no single point of failure. Figure 2 shows this configuration.
Figure 2. Two-node cluster configuration
Future SAN growth
One significant advantage of a SAN is its ability to grow. The entry-level configuration has a single server and a single storage unit attached via Fibre Channel switches. This SAN uses Fibre Channel switches to connect the storage and servers.

In the two-node cluster configuration, two PowerEdge 8450 servers and a PowerVault 660F storage unit are linked with a PowerVault 51F switch.
A Dell SAN can include up to 20 servers. Although this requires additional fiber switches, all the servers connected to the SAN can still use the same storage. Storage units can be added as needed, up to seven PowerVault 224Fs per PowerVault 660F and up to eight PowerVault 660Fs in a single SAN.
The entry-level configuration could be expanded by adding a server for Microsoft Exchange® e-mail services or another Microsoft SQL Server database for a new application. Three new servers would require additional storage capacity that can be provided by adding a PowerVault 224F. Figure 3 shows this possible future growth configuration.
Figure 3. Possible future growth configuration
Software setup and configuration
Hardware must be set up and properly cabled before beginning software installation and configuration. This helps to reduce the number of problems reported by the software during installation and setup.
The SAN server needs Microsoft Windows 2000 Advanced Server with Service Pack 1, Microsoft SQL Server 2000, Dell SAN 4.0, Dell OpenManageTM , and VERITAS® Backup Exec® . The SAN 4.0 CD-ROM (shipped as part of SAN) contains the correct versions of the software components. A SAN 4.0 Compatibility Guide, included on the SAN CD, lists the required versions.
Software for the entry-level configuration
The software installation of the SAN server should follow this order:
- Windows 2000 Advanced Server. A typical installation of Windows 2000 Advanced Server with Service Pack 1 is needed. If the server has a PowerEdge Expandable RAID Controller (PERC) adapter, the current drivers must be loaded. It is recommended to put the swap file on a separate disk from the OS.
- QLogic 2100/2200 HBA drivers. Windows 2000 automatically detects and installs default drivers for the QLogic 2100/QLogic 2200. These drivers should be updated using the QLogic drivers found on the SAN 4.0 CD.
- QLConfig utility. The QLConfig utility, which provides configuration for QLogic 2100/2200 HBAs, is installed from the SAN 4.0 CD. Once installed, set one HBA as the primary path and another as the failover path.
- QLDirect utility. QLDirect, also installed from the SAN 4.0 CD, monitors the HBA that was set as the primary path. If it detects a failure, the utility switches traffic to the failover path.
- Array Manager. The Array Manager tool allows SAN administrators to create virtual disks of different RAID types on the PowerVault 660F/224F. It also controls the cache configuration of the storage processors, definition of hot spares, and all other configuration for the PowerVault 660F/224F. It is installed from the SAN 4.0 CD.
- Storage Consolidation. Storage Consolidation enables configuration of the logical unit number (LUN) masking function on the PowerVault 660F. LUN masking assigns specific LUNs (virtual disks) to specific servers. LUN masking does not assign the LUNs designated for a specific server; it hides (masks) all LUNs not designated for a specific server. To prevent data corruption, it is important that only one server be assigned and actually use a LUN at one time. This does not apply to cluster configurations because clusters have special drivers to prevent data corruption. Storage Consolidation is found on the SAN 4.0 CD.
- Create virtual disks and assign to server. Array Manager can create the desired virtual disks. SQL Server requires at least two virtual disks: one for the transaction log files and the other for database files. Once virtual disks are created, Storage Consolidation can assign them to servers on the SAN.
- Windows Disk Management. The Disk Management utility is located in the Computer Management administration tool of Windows 2000. Once Array Manager is installed, it replaces Disk Manager. All Disk Manager functions are located on the Disk View tab of the Array Manager interface. The virtual disks that were created and assigned will appear as unrecognized disks on the disk tab view. Windows 2000 must write a signature to the disk, then format the drive and assign it a drive letter and volume name.
- SQL Server 2000. The installation of SQL Server is easy and quick. Be sure to place transaction log files and database files on separate virtual disks.
- VERITAS Backup Exec. During installation of Backup Exec, it is important to select the tape drivers, Microsoft SQL Server Agent, Open File, and SAN shared storage options. These enable Backup Exec to provide the required functionality for backing up SQL Server databases that are actively being used.
|
Software for two-node cluster configuration
The software installation for the two-node cluster configuration is the same as the first eight steps for the entry-level configuration. Both cluster nodes require steps 1 through 8. The only exception is that the virtual disk creation and assignment must be done only once. The list below shows the additional steps necessary for the cluster configuration:
- Install Microsoft Cluster Service. Cluster services must be installed on both cluster nodes.
- Install MSDTC and configure cluster disk group. An SQL Server cluster requires Microsoft Distributed Transaction Coordinator (MSDTC) in an SQL Server cluster for distributed queries and two-phase commit transactions. It is installed by entering the command comclust at a command prompt. Once installed, use Cluster Administrator to ensure that MSDTC and all disks to be used by SQL Server are in the same cluster group. If the cluster disk is not in the same group, then SQL Server will not allow databases to be created on it. An SQL Server error will be generated indicating that SQL Server cannot use a disk that the SQL cluster service is not dependent on.
- Install SQL Server 2000. SQL Server 2000 is cluster-aware and can fail over all functions to the secondary node if a primary node fails. The SQL Server 2000 installation program automatically sets up the required software in both nodes. Installation creates a virtual SQL Server on the cluster. A name and IP address on the same subnet as the nodes must be provided for the virtual SQL Server during installation. This name and IP address are used for accessing the SQL Server, regardless of the cluster node it is running on.
- Install VERITAS Backup Exec. During installation, the VERITAS Backup Exec cluster application must be assigned a virtual server name and IP address that differ from the virtual SQL Server and the virtual cluster address.
|
For a complete step-by-step description of implementing SQL Server with failover clustering on Dell servers and storage, see the Dell Solution Guide entitled Microsoft SQL Server with Failover Clustering or visit online at http://www.dell.com/downloads/us/pedge/SQLClustering_Final.doc.
Configuring the virtual disk
To effectively configure storage into virtual disks requires a thorough understanding of the application that will be using the database. For SQL Server 2000, it is necessary to know the ratio of disk reads to writes for the application using the database.
The sample database and simulated load described in this article perform approximately 65 percent reads and 35 percent writes. This was determined by monitoring the database server while it was under load with Performance Monitor, which is located in the Administrative Tools folder for Windows 2000. The heavily used database is about 16 GB. The virtual disks for both configurations were designed using this information.
RAID levels
RAID allows multiple disks to be logically combined and appear as a single disk to the operating system. Combining disks improves performance because more read/write operations can be performed at the same time. The controller cards use the multiple drives to optimally read and write the data. The different RAID types define how data is stored and how that data is spread across the drives. In addition, most RAID types provide fault tolerance so that failure of a single physical disk does not cause the virtual disk or any of its data to be lost.
The PowerVault 660F supports these RAID types:
- RAID-0 stripes data across the drives with no fault tolerance.
- RAID-1 mirrors a single drive to another drive.
- RAID-5 stripes data across the disks and uses a parity calculation for fault tolerance. The parity information is stored on each of the disks and used to rebuild any single drive if a failure occurs.
- RAID-0+1 is striping data across multiple RAID-1 disks. RAID-0+1 is multiple RAID-1 combined with RAID-0. Fault tolerance is maintained because RAID-0 is striped across the RAID-1.
|
Disks can perform more sequential than random I/Os per second. Sequential I/Os require fewer physical disks, while more disks are better with random I/Os.
RAID types have different performance levels because of their fault tolerance implementation. RAID-0 has the best performance because it does not have any fault tolerance. RAID-5 has slower write performance because it uses the most sophisticated method for fault tolerance.
SQL Server performs best if the transaction logs and database files reside on separate virtual disks. Transaction logs are critical for restoring the database after a failure, and most reads/writes to the disks are sequential. The sequential nature of transaction log I/O allows high performance with fewer disks, which makes RAID-1 a good choice for these files. A RAID-0+1 can also be used if additional storage capacity is required.
Dell recommends RAID-0+1 for database files with a random I/O pattern and writes of 35 percent or higher. RAID-0+1 offers the best performance for random writes, but available storage equals only half of the total capacity of all the drives in the RAID-0+1 disk set. For example, a RAID 0+1 configured with ten 18 GB drives for a total of 180 GB of disk space will have only 90 GB available because of the data mirroring within the RAID-0+1.
Virtual configuration for entry-level setup
The entry-level SAN configuration described earlier included one PowerVault 660F with 14 drives. The application being tested required high random write database performance, which indicated that a RAID-0+1 was needed.
One disk was defined as a hot spare and the others were put into a disk group (see Figure 4 ). Three RAID-0+1 virtual disks were created in the disk group: 4 GB virtual disk as the transaction log, 53 GB virtual disk for database files, and 50 GB for database backup to disk.
Figure 4. Entry-level virtual disk configuration
Virtual configuration for two-node cluster
Three disk groups were created for the two-node cluster configuration because there were twice as many disks as in the entry-level configuration, and the disk requirements for a cluster are different.
Disk group 1. Four disks were divided evenly into two virtual RAID-0+1 disks: one for database transaction logs and the other for the quorum, used by the cluster to share configuration information among cluster nodes. The quorum virtual disk must be dedicated and cannot be used for other functions.
Disk group 2. Twelve disks were created as a single RAID-0+1 virtual disk for the database files.
Disk group 3. These remaining 12 drives were divided into a RAID-0+1 and a RAID-5. These two virtual disks were used for backup to disk. The RAID-5 provided more available disk space and the RAID-0+1 offered greater performance. Figure 5 shows this two-node configuration.
Figure 5. Two-node cluster virtual disk configuration
The backup/restore test environment
A requirement for any data intensive environment is that data does not get lost because of server or disk failures. This means scheduling regular tape backups and archiving tapes for restoring previous data states.
The time needed to perform a backup is a critical factor. The time window for backups is becoming smaller and smaller with online businesses and globalization. The need to back up data during active use is growing. SQL Server 2000 providesthe ability to perform online backups without taking the database offline.
Most database backups consist of full, differential, and transaction log backups. A full backup is a copy of the database at a given point in time. A differential backup is a copy of only the changes since the last full backup. A transaction log lists the transactions or changes that have been made to the database. This combination makes up a complete database backup and allows for a full recovery in the event of a failure.
The two configurations used for our tests have a PowerVault 130T Tape Library attached to the SAN with a PowerVault 35F Fibre Channel-to-SCSI bridge. In this configuration, the backup does not compete with other traffic on the Ethernet TCP/IP network because all of the data stays on the SAN using the full-duplex, dedicated 100 MB/sec Fibre Channel connection.
The testing scenarios
For the testing, we used the test configurations described earlier and ran some full and differential backup jobs to determine the time required for backup and restore activities with a 15 GB sample database. Backup was performed twice: once without any activity on the database server and again with the database server CPU utilization at approximately 30 percent. Since transaction logs are relatively small and require little time to back up or recover, they were not tested.
Test scenarios varied based on the server being idle or busy, backup to either tape or disk, and a full or differential backup. The idle backup was run when nothing else was running on the server, which allowed server and storage resources to be dedicated to the backup operation. To achieve a busy server state, a load simulation tool was used to create CPU utilization of approximately 30 percent for the database on the server. Backup and restore—both differential and full backups—were performed to and from both disk and tape.
All backup and restore operations to tape were done with VERITAS Backup Exec 8.5, which has an SQL Server agent that allows it to interact with the database server. The agent option must be selected during installation of Backup Exec. Note: When restoring a backup with VERITAS Backup Exec, do not select the Recover Database option until restoring the last full, differential, or transaction log backup. Otherwise, it will be necessary to start the restore again to get the differential backup or transaction log applied to the restore.
For the tests, SQL Enterprise Manager performed all backup and restore operations to disk because VERITAS Backup Exec backs up to tape devices only. SQL Enterprise Manager requires that both the differential and full backups be performed at the same time; otherwise, the differential will not be restored.
Test results show high throughput for backup to disk
In our tests, we used the PowerVault 130T Tape Library attached to the SAN via the PowerVault 35F for the backup jobs. The tape library can have up to four drives installed, which could increase throughput up to a factor of four. However, for our tests, we installed a single DLT 7000 tape drive in the library. Figures 6 and 7 show the results for a single DLT 7000 drive.
Figure 6. Backup throughput for a single DLT 7000 tape drive
Figure 7. Restore throughput for a single DLT 7000 tape drive
In the backup and restore tests, throughput was measured in megabytes per minute. A higher throughput number is better because more data is processed in a shorter time period. The use of more tape drives would increase the throughput.
The results show that the highest throughput was achieved with an idle database server and the backup to disk. The throughput shows very little difference between the idle server and the busy server when backing up to tape. Figures 8 and 9 provide backup and restore test results.
Figure 8. Backup data test results
Figure 9. Restore data test results
Analysis: Backup to disk versus tape backup
Backup to disk provides a significant improvement in throughput over tape. A backup to disk can reduce the overall backup time to one quarter of that required if using a single tape drive. An online backup to disk can actually be slower than a backup to tape if the database and backup disk are sharing the same physical drives.
In the entry-level backup to disk example, the throughput was dramatically slower when the database server was 30 percent utilized because the database and backup file were sharing the same physical drives. The two-node cluster example showed much higher throughput for the loaded server's backup to disk because the database and backup file were on separate physical disks.
Throughput results to and from tape were similar for all backup and restore operations. The decrease in performance between the idle and loaded backup tests was between 5 percent and 15 percent. This indicates only a small performance hit for online backup to tape.
Higher throughput and reduced backup time resulted from using multiple tape drives. An advantage to tape backup (in addition to lower cost) is the ability to archive tapes off-site in multiple locations, which decreases the chance of a catastrophic data failure.
The ability to perform an online backup to tape and maintain good data throughput makes this use of multiple tape drives the recommended method for SQL Server database backup. Backup to disk should only be used when backup time is the most critical factor.
Todd Muirhead (todd_muirhead@dell.com) is an engineer consultant in the Solution Enablement Lab and Showcase. He specializes in storage area networks and database systems. Todd holds a B.A. in Computer Science from the University of North Texas and is MCSE+Internet certified.
For more information
Garcia, Marci Frohock. Microsoft SQL Server 2000 Administrator's Companion . Redmond, WA: Microsoft Press, 2000.
Dell Power Solutions online:
http://www.dell.com/powersolutions/
Microsoft SQL Server home page: http://www.microsoft.com/sql
Dell Solution Guide entitled Microsoft SQL Server with Failover Clustering :
http://www.dell.com/downloads/us/pedge/SQLClustering_Final.doc