Dell Power Solutions

Dell Power Solutions

Dell Magazines

Dell Magazines

Dell Power Solutions

Dell Power Solutions
Subscription Center
Advertise
Submit an Article
Magazine Extras

Dell Insight

Dell Insight Archives

Migrating Sun-Based Oracle Databases to Dell PowerEdge Servers Running Red Hat Linux

Dave Jaffe, Ph.D., and Todd Muirhead (August 2002)

Dell compared data-mining performance of the DellTM PowerEdgeTM  6450 server, the Dell PowerEdge 8450 server (both running Red Hat® Linux®  Advanced Server 2.1), and the SunTM EnterpriseTM  4500 server running the SolarisTM  8 operating system. A set of eight intensive queries was run against an Oracle9iTM  database that resided on each server. The Dell servers ran the queries faster and had a better price/performance ratio than a comparably equipped Sun server.

In a previous Sun-to-Dell migration study,1 the DellTM  Technology Showcase team built an online query engine using Oracle®  Portal on Oracle9iTM  Application Server to handle many simultaneous small requests (Top 10 lists of various baseball statistics for a given year and league). The tests showed that four- and eight-CPU Dell PowerEdge™ servers running Oracle9i Database on the Microsoft® Windows®  2000 Advanced Server operating system could handle a large number of simultaneous users more easily than a comparably configured database system running on the SunTM EnterpriseTM  4500 server.

This article describes a study that compares pure system performance for data mining on the Sun server running the SolarisTM  8 operating system against Dell servers running Red Hat® Linux®  Advanced Server 2.1. For this study, the Dell team designed eight intensive queries to the same baseball database and ran the queries simultaneously against each database server to simulate multiple analysts accessing the database.

Hardware configuration of the database servers

The Sun server for both the four-CPU and eight-CPU configurations was an Enterprise 4500 midrange server running the July 2001 version of Solaris 8. In both configurations, the Enterprise 4500 used 64-bit UltraSPARCTM  II 450 MHz processors. Both the four-CPU Dell PowerEdge 6450 and the eight-CPU Dell PowerEdge 8450 ran Red Hat Linux Advanced Server 2.1 on Intel® Pentium®  III XeonTM  processors at 700 MHz, based on the Intel 32-bit architecture. Figure 1 shows the configuration details.

Figure 1. Database server comparison: Dell PowerEdge 6450 and PowerEdge 8450 versus Sun Enterprise 4500
Figure 1. Database server comparison: Dell PowerEdge 6450 and PowerEdge 8450 versus Sun Enterprise 4500

The Sun and Dell database servers had similarly configured external SCSI storage. The Sun server was attached using Fibre Channel to a Sun StorEdgeTM  A3500FC controller that drove a single StorEdge D1000 disk pod containing eight 36 GB SCSI disks. The Dell PowerEdge 6450 and 8450 servers were attached to a Dell PowerVaultTM  210S SCSI disk pod through Dell PowerEdge Expandable RAID Controller 2, Dual Channel, (PERC 2/DC) Peripheral Component Interconnect (PCI) cards. Each PowerVault 210S contained ten 18 GB disks. The RAID configurations for the systems, shown in Figure 2 , were created to be as similar as possible; however, the RAID configurations differed slightly because each server had a different number of internal disks.

Figure 2. RAID configurations for the database servers
Figure 2. RAID configurations for the database servers

Configuration of the database software

The Dell team used the Oracle Universal Installer to install Oracle9i Database with the same options on both the Red Hat Linux Advanced Server 2.1 and Solaris 8 operating systems. Oracle9i installation was completed without creating an initial database. Using similar settings on the two platforms, the team ran Oracle Database Configuration Assistant to create a new database instance configured with the shared server option. The control files, log files, and data files were placed on different partitions of the external storage (see Figure 3 ). The team also applied the appropriate patch set on each platform to upgrade Oracle9i Database to the 9.0.1.2.0 level.

Figure 3. Oracle database layout
Figure 3. Oracle database layout

Figure 4 lists the database initialization parameters. The Sun Enterprise 4500 and the PowerEdge 8450 servers were configured similarly to facilitate an eight-CPU comparison. Parameters on the PowerEdge 6450 database were scaled down to reflect the lower processing power. The parameter DB_WRITER_PROCESSES (number of database writer processes) was set equal to the number of processors in the respective servers.

Figure 4. Oracle9i tuning parameters
Figure 4. Oracle9i tuning parameters

Test methodology: Database queries and migration

A database of baseball statistics, Baseball Archive Database 4.0, was obtained from http://www.baseball1.com/statistics in Microsoft Access form, exported as comma-separated values, and then loaded into Oracle tables with the sqlldr command. The database contains batting, pitching, and fielding data for 15,558 players from 1886 to the present.

Developing the queries
Using Oracle SQL for Analysis functions, the Dell team wrote a set of eight data-mining queries that represent fairly CPU-intensive requests typically made by baseball analysts; each query takes at least 20 seconds on an otherwise unloaded PowerEdge 6450. These queries are of three types.

The first type determines sustained performance over five consecutive years for several statistics (home runs, batting average, earned run average, and so on); see the sample query 5yr_bl.sql at the end of this article. To generate totals of each statistic for each player, the query applies the sum(column) function to the batting table partitioned by the unique player identification and ordered by year, with a five-year moving window. Totals are calculated for each year plus the four previous years. The query then sorts the results and displays the top 10 best five years in each category.

The second type of query calculates which players had the best overall seasons historically, based on a mixture of batting or pitching statistics (see the sample query bs.sql at the end of this article). The Oracle analytical function rank() is used to generate a new database view that lists each player, the year, and the player's all-time rank for several statistics that year. The ranks are then summed and ordered lowest total rank first, thus generating lists of all-time best seasons. To determine overall offensive performance over a season, for example, the query factors in home runs, runs batted in, hits, runs scored, and batting average; the leader in this category would have the lowest sum of that season's all-time ranks for these five statistics.

The third query (see the sample query yr_best_al.sql at the end of this article) uses the Oracle Procedural Language/Structured Query Language (PL/SQLTM ) to loop through the years 1901 to 2000 and generate Top 10 lists of overall offensive prowess for each year using the rank() function. The players are listed in order of lowest total rank for home runs, runs batted in, and batting average-the so-called "Triple Crown."

The eight queries used in the test are all variations of these three queries. The full set of queries is available from the authors.

Migrating the database
Migrating the baseball statistics database from the Sun database server to the two Dell servers was easily accomplished through the Oracle export and import commands. On the source database (the Sun system in this case), the Dell team created a database dump file using the following command:

% exp baseball/password parfile=exp.params

where user "baseball" is the owner of the baseball database and the export parameter file exp.params contains

TABLES=(MASTER, TEAMMASTER, BATTING, PITCHING, FIELDING)
FILE=baseball.dmp

The baseball.dmp file was copied to the two Dell systems and then the import command was run:

% imp baseball/password parfile=imp.params

where the import parameter file imp.params contains:

FILE=baseball.dmp
IGNORE=Y
FULL=Y

The import/export process also can be executed through Oracle Enterprise Manager, a graphical management tool.

Results of performance tests

The workload consisted of running the eight queries simultaneously and measuring the elapsed time with the Oracle timing function. To achieve simultaneous execution, the team created a simple shell script that used the sql plus command and started each query in background mode (see Figure 5 ). The script ran three times directly on each database server, and the results were averaged.

Figure 5. The run8.sh test script
Figure 5. The run8.sh test script

In the four-CPU comparison (see Figure 6 ), the Dell PowerEdge 6450 completed the eight queries in an average time of 39.58 seconds, 69 percent faster than the Sun Enterprise 4500 time of 66.85 seconds. The tested Dell system was 6.17 times less expensive than the Sun, giving Dell an overall price/performance advantage of 10.43x.

Figure 6. Four-CPU test results: PowerEdge 6450 versus Sun Enterprise 4500
Figure 6. Four-CPU test results: PowerEdge 6450 versus Sun Enterprise 4500

The eight-CPU comparison between the Dell PowerEdge 8450 and the Sun Enterprise 4500 showed similar results: the eight queries took an average of 22.92 seconds on the Dell system, 55 percent faster than the 35.54 seconds taken by the eight-way Sun (see Figure 7 ). With a price 4.38 times less than that of the comparably equipped Enterprise 4500, the PowerEdge 8450 has a 6.79x price/performance advantage.

Figure 7. Eight-CPU test results: PowerEdge 8450 versus Sun Enterprise 4500
Figure 7. Eight-CPU test results: PowerEdge 8450 versus Sun Enterprise 4500

Dell PowerEdge: Higher performance for the dollar

Large databases built on state-of-the-art database management systems such as Oracle9i are vital to modern enterprises. The ability to quickly retrieve useful information from these databases is paramount.

Organizations can obtain the required high performance in a cost-effective manner by migrating Sun-based Oracle systems to Dell-based systems. Using realistic queries against an Oracle9i database of baseball statistics, Dell found that the industry-standard Dell servers running the open-source Linux operating system are as much as six times less expensive than the comparably equipped proprietary Sun server. Yet the Dell servers outperformed the Sun server by as much as 69 percent, resulting in more than 10 times the performance per dollar for the four-CPU systems and almost seven times for the eight-CPU systems.

For fans of sustained performance and baseball, Mark McGwire had the best five-year stretch with the long ball, averaging 56.8 home runs per year from 1995 to 1999, Rogers Hornsby batted .402 between 1921 and 1925, Lou Gehrig averaged almost 163 runs batted in per year from 1930 to 1934, and Sandy Koufax had an earned run average of 1.949 from 1962 to 1966, leading all pitchers after 1920.

Acknowledgments

The authors would like to thank Sean Lahman for use of his baseball data.

Dave Jaffe, Ph.D. (dave_jaffe@dell.com) is a senior consultant on the Dell Technology Showcase team, specializing in cross-platform solutions. Dave has a Ph.D. in Chemistry from the University of California, San Diego, and a B.S. in Chemistry from Yale University.

Todd Muirhead (todd_muirhead@dell.com) is an engineer consultant on the Dell Technology Showcase team. Todd holds a B.A. in Computer Science from the University of North Texas and is MCSE+Internet certified.

FOR MORE INFORMATION

Jaffe, Dave, Ph.D. "Migrating Sun-based ISPs to Dell PowerApp.web Servers Running Red Hat Linux." Dell Power Solutions , Issue 4, 2000

Jaffe, Dave, Ph.D. "Migrating Sun-Based ISPs to Dell PowerApp.web Servers Running Windows 2000." Dell Power Solutions , Issue 1, 2001

Jaffe, Dave, Ph.D. "Migrating Sun-Based BEA WebLogic Application Servers to Dell PowerEdge Servers." Dell Power Solutions , Issue 2, 2001

Jaffe, Dave, Ph.D. "Migrating Sun-based BEA WebLogic Application Servers to Dell PowerEdge Servers." Dell Power Solutions , Issue 3, 2001

Jaffe, Dave, Ph.D., and Todd Muirhead. "Migrating Sun-Based Oracle Databases to Dell PowerEdge Servers." Dell Power Solutions , May 2002

Sample Queries

Figure A. The 5yr_b1.sql query: Best individual statistics over five sequential years
Figure A. The 5yr_b1.sql query: Best individual statistics over five sequential years

Figure B. The bs.sql query: Best individual seasons of all time
Figure B. The bs.sql query: Best individual seasons of all time

Figure C. The yr_best_al.sql query: Best offensive players per year in the American League
Figure C. The yr_best_al.sql query: Best offensive players per year in the American League

© 2010 Dell | About Dell | Terms of Sale | Unresolved Issues | Privacy | About Our Ads | Dell Recycling | Contact | Site Map | Feedback

snDWW3