Heterogeneous Database Replication: SQL Server 2000 to Oracle9i
Hemant Deshpande and Rudy Lee Martinez (August 2003)
Before integrating databases from different vendors, administrators must establish, manage, and maintain connections across heterogeneous relational database management systems (RDBMSs). This article reviews considerations and techniques for updating data from a Microsoft® SQL Server 2000 database to an Oracle9iTM database.
The complexity and volume of mission-critical corporate data have been growing exponentially. Sales, marketing, finance, and manufacturing databases that house vital business information typically have been deployed over the years by many different teams that often do not understand database implementations existing elsewhere in the organization—or appreciate the value of creating a foundation for future development. As a result, data architects for new systems often must draw from information stores in legacy applications that were implemented on relational database management systems (RDBMSs) from different vendors.
Although heterogeneous database platforms present considerable hurdles to connectivity, the benefits of integration usually outweigh the obstacles. Data architects can eliminate redundant data entry, storage, and maintenance by using existing data stores when designing new systems. To accomplish this goal, administrators must link databases so that data can be transferred from one system to another. Several methods can facilitate the transfer, ranging from distributed transactions to transaction log shipping. This article reviews one such method, replication, which provides an elegant and efficient transfer mechanism.
Understanding replication terminology
Basic terms used to describe replication in this article include:
- Publisher: A Microsoft® SQL Server database is the publisher, which constitutes the primary source—the database whose data is to be replicated.
- Subscriber: A SQL Server database (homogeneous replication) or non-SQL Server database (heterogeneous replication) is the subscriber, which receives data from the publisher. The subscriber also can update the published data and direct it back to the publisher.
- Distributor: After receiving data from the publisher, the distributor server forwards the data to the appropriate subscriber(s). The distributor and publisher can reside on the same or separate servers.
- Article: As a database object that is the fundamental component of the replication, an article can consist of vertical or horizontal table partitions, or one or more entire tables.
- Publication: One or more articles compose a publication. A publication is simply the list of articles to be published to subscribing databases.
- Replication agents: Log reader, snapshot, and distribution agents perform the processing tasks of the replication. The log reader agent moves the data from the transaction logs on the publisher to the distributor. The snapshot agent runs on the distributor and creates the snapshot files. The distribution agent transfers the data from the distributor to the downstream subscriber databases.
- Linked server: The linked server acts as a pass-through proxy for the Oracle® subscription database.
Defining replication methods
Microsoft SQL Server 2000 provides three types of replication methods: snapshot, transactional, and merge. Snapshot replication performs a bulk transfer of the entire data package from the publisher to the subscriber(s). This method has a high degree of latency, which is the time lag between source and destination updates. However, snapshot replication also retains a high degree of autonomy among participating databases. That is, subscriber databases are required to maintain connectivity to the distributor only during the replication session.
Transactional replication transfers smaller updates to the subscriber(s) more frequently, and with lower data latency, than snapshot replication. However, because updates occur more often, transactional replication also supports less autonomy than snapshot replication. Transactional replication is popular when unidirectional and frequent updates are required.
Merge replication is the third type of replication supported by SQL Server 2000. The primary difference between merge replication and the other two types is that subscribers can submit updates to the publisher; these updates are then applied to the other subscribers. Merge replication supports a high level of autonomy but pays a commensurate penalty in data latency.
Enabling SQL Server 2000toOracle9i replication
To create the environment for SQL Server 2000toOracle replication, Oracle9iTM client software must be installed on the SQL Server 2000 server, which acts either as a distributor or a linked server, or both (see Figure 1 ). To install the Oracle9i client software, administrators add the Oracle database Transparent Network Substrate (TNS) details, which contain information about the Oracle database server, to the tnsnames.ora file. Administrators then install the appropriate Open Database Connectivity (ODBC) or object linking and embedding database (OLE DB) drivers on the SQL Server 2000 server. Finally, administrators update the Microsoft Data Access Components (MDAC) version to 2.5 or later.
Figure 1. Configuration for SQL Server 2000toOracle9i replication
Setting up the distributor
To set up the distributor, administrators can select "Wizards" from the Tools menu in SQL Server 2000 Enterprise Manager. From the pop-up menu, administrators expand "Replication" and select the Configure Publishing and Distribution wizard.
The distributor and publisher can reside on the same or different servers. The first step in this wizard allows administrators either to create a new distributor or to use an existing one. If the local server option is chosen (distributor and publisher on the same server), then in the Specify Snapshot Folder step, administrators specify a snapshot folder or select the default. In turn, replication agents access the snapshot folder to transfer data from the publisher to the distributor.
If the remote server option is chosen (distributor and publisher on different servers), administrators then specify the network path for the snapshot folder in the Specify Snapshot Folder step of the wizard. For this option, administrators also must configure the distributor as a SQL Server 2000 database with the appropriate SQL Server 2000 and Oracle drivers. A confirmation message should appear when a new distributor has been configured correctly.
Setting up the publisher
To set up the publisher, administrators select "Wizards" from the Tools menu in SQL Server 2000 Enterprise Manager. After selecting the Replication wizard, administrators select "Create and Manage Publications" from the list of options. A pop-up menu appears from which administrators create a publication by selecting a database from the list of potential candidates.
A Select Distributor pop-up menu then appears from which administrators can select the distributor. After selecting the distributor, administrators configure the publisher in the Choose Publication Database step by first selecting the publication database, then specifying the publication type: snapshot, transactional, or merge. A snapshot publication periodically sends snapshots to the subscriber(s). A transactional publication sends data changes to the subscriber(s) incrementally. A merge publication is used when the publisher and subscriber(s) must merge the data periodically at the publisher.
Specifying the subscriber type
The Create Publication wizard now comes to one of the most important steps: the Specify Subscriber Types page (see Figure 2). For a SQL Server 2000toOracle9i replication, administrators must select the Heterogeneous Data Sources option, which enables heterogeneous replication. After the subscriber types have been selected, administrators specify the articlesthe fundamental building blocksthat will be part of the replication by checking the boxes for selected tables in the Specify Articles step of the Create Publication wizard. For transactional replication, only tables with primary keys can be published. After articles have been selected, administrators specify a publication name and description, completing the publication setup.
Figure 2. Specifying subscriber types in the Create Publication Wizard
Setting up linked servers
By providing connectivity between a SQL Server 2000 database and external databases, a linked server enables a primary SQL Server 2000 database to access data in multiple external databases. Linked servers are important for heterogeneous replication because they allow access to the Oracle database (the subscriber in the SQL Server 2000toOracle9i replication topology). The linked server can be configured from the security node of the database server menu tree in Enterprise Manager. Administrators can configure the linked server using SQL scripts or the Linked Server wizard in Enterprise Manager.
As with the distributor server setup, linked servers also must contain the appropriate drivers (see the "Setting up the distributor" section of this article). However, if the distributor and linked server reside on the same server, the driver and TNS file information will be required only on the combined server.
To use the wizard, administrators right-click the Linked Server node and select the New option. Administrators can set up the linked server for an Oracle9i database by using the Microsoft ODBC driver or OLE DB provider. Figure 3 shows a setup using the OLE DB provider. In the Linked Server PropertiesNew Linked Server dialog box, administrators click the General tab and enter the name for the linked server and the data source, which is the Oracle database alias.
The next step is to select the Security tab and enter the Oracle user login and password (see Figure 4 ). Administrators should use an appropriately secure password regimen when accessing data across linked servers. This step completes the linked server setup.
Figure 3. Entering linked server and data source names
Figure 4. Entering the Oracle user login and password
Testing the linked server
To verify that a linked server has been created successfully, administrators can select the Linked Server node in Enterprise Manager under the database server node. To see the names of the Oracle tables and view descriptions, administrators expand the linked server node. Selecting a table option under the linked server name displays all the tables associated with the destination; in the SQL Server 2000toOracle9i example, the heterogeneous database will be displayed.
Another method to verify successful linked server creation is to run a test SQL statement that returns the data from a selected table in the linked server, such as the following:
Select * from linkedservername..schema.tablename
Setting up subscriptions
Two types of subscriptions exist: push and pull. In push subscriptions, the publication server is responsible for replicating the data to subscribers. In pull subscriptions, the subscribing database controls the data flow from the publication server. Pull subscriptions are used when a higher degree of security is required.
To set up a subscription, administrators can select the Wizards option on the Tools menu and then select either the Create Push Subscription wizard or Create Pull Subscription wizard. Another alternative is to select "Create and Manage Publications" from the replication node of the wizard. Next, administrators select the publication from the publication server and click the Push New Subscription button.
The "Setting up linked servers" section earlier in this article described steps for configuring a linked server as a pass-through proxy for the Oracle subscription database. After clicking the Push New Subscription button, this linked server appears on a list of subscribers. Administrators should choose this linked server and then provide a name for it. By default, the Oracle database alias name will be used as the subscription database name.
Once subscribers are configured, administrators can arrange the distribution agent schedule. The replication times can be continuous or scheduled. This step completes the subscription database setup.
After the replication agents have been started, administrators can test the setup by making a change on the SQL Server 2000 publication database side and verifying that the change has been replicated to the Oracle database according to the schedule.
Managing heterogeneous data replication
The increasing complexity of enterprise data environments often requires sharing data between heterogeneous RDBMSs. SQL Server 2000 provides tools that help administrators create and maintain the transport mechanisms. By following the steps outlined in this article, administrators can more easily manage heterogeneous replication between Microsoft SQL Server 2000 and Oracle9i databases.
Hemant Deshpande (firstname.lastname@example.org) is a programmer analyst on the DellStar Enterprise team. He is responsible for database development and support for the project, which uses an Oracle database. Hemant has an M.S. in Industrial Engineering Information Systems from Arizona State University.
Rudy Lee Martinez (email@example.com) is the program manager for the DellStar Enterprise project. Rudy has an M.S. in Computer Science from Wake Forest University and is a Microsoft Certified Systems Engineer (MCSE), Solutions Developer (MCSD), Database Administrator (MCDBA), and Trainer (MCT).
For more information
Dell and Oracle: http://www.dell.com/oracle
Microsoft SQL Server: http://www.microsoft.com/sql
Comparing data type mappings
A thorough comparative understanding of the SQL Server 2000 and Oracle data types is critical to implementing a successful replication. Figure A shows the standard data type mappings that SQL Server 2000 uses for heterogeneous replication.
Figure A. Data type mappings between SQL Server 2000 and Oracle databases
Alternative procedures to set up linked servers
Administrators also can create and configure a linked server by executing procedures from the SQL Server 2000 Query Analyzer. This technique uses sp_addlinkedserver to set up the linked server, as follows:
sp_addlinkedserver [ @server = 'server_name' ]
[ , [ @srvproduct = ] 'product_name' ]
[ , [ @provider = ] 'provider_name' ]
[ , [ @datasrc = ] 'data_source' ]
[ , [ @location = ] 'location' ]
[ , [ @provstr = ] 'provider_string' ]
[ , [ @catalog = ] 'catalog' ]
- server_name: Name of the linked server
- product_name: Product name of the data source to be added as a linked server
- provider_name: The OLE DB provider corresponding to the data source
- data_source: Source interpreted by the provider
- location: Database location
- provider_string: Provider-specific connection string
- catalog: Catalog used while making connection to the provider
For example, administrators might enter:
Exec sp_addlinkedserver 'server_name','oracle', 'msdaora','data source alias'
The next step is the login setup for the linked server:
sp_addlinkedsrvlogin [ @rmtsrvname = ] 'rmtsrvname'
[ , [ @useself = ] 'useself' ]
[ , [ @locallogin = ] 'locallogin' ]
[ , [ @rmtuser = ] 'rmtuser' ]
[ , [ @rmtpassword = ] 'rmtpassword'
- Rmtsrvname: Linked server name to be used
- useself: A true value (the default) indicates that SQL Server 2000 users can use their credentials to connect to the remote server (Oracle9i) login. A false value requires rmtuser and rmtpassword commands to determine the name of the login
- locallogin: Name to log in on local server
- rmtuser: User name for logging into the Oracle database
- rmtpassword: Password for logging into the Oracle database
For example, administrators might enter:
Exec sp_addlinkedsrvlogin 'server name', 'false','NULL'. ''ora_login'', ''ora_password''
This process is useful when creating multiple linked servers.