Replication is a way of keeping data synchronized in multiple databases. Implementing and maintaining replication might not be a simple proposition: If you have numerous database servers that need to be involved in various types of replication, a simple task can quickly become complex. Implementing replication can also be complicated by the application architecture.
Microsoft SQL Server has supported replication since version 6.0, and setting up replication has become significantly easier over the years (in fact, 99 percent of replication setup can be accomplished by clicking through replication wizards). However, replication involves much more than setup, and unfortunately there aren't many sources of information for implementing and troubleshooting it. The only way to learn replication is to dig through the knowledge base articles and figure things out on your own.
Replication Terminology
SQL Server replication is commonly described by using the publisher/subscriber metaphor. A database server that makes data available for replication (source server) is referred to as the publisher; a collection of one or more database objects that are enabled for replication is called a publication. SQL Server supports replicating tables, views, stored procedures, and user-defined functions.
One or more servers that get data and/or transactions from the publisher are called subscribers. Replication is managed by the system database, which by default is called distribution. A distribution database—which can reside on the publisher, subscriber, or on a separate server—is created when you configure replication.
The server that hosts the distribution database is referred to as the distribution server or distributor.
It is recommended that you always use a server that is dedicated to distributing transactions. Thus, the distribution server should be used for nothing but replication.
Each database server can act as a publisher and subscriber at the same time. Each publisher can have multiple subscribers, and each subscriber can receive transactions from multiple publishers.
You should also become familiar with replication agents, which are implemented as SQL Server jobs that perform a particular task according to their schedule.
Snapshot Replication
Snapshot replication simply takes a "snapshot" of the data on one server and moves that data to another server (or another database on the same server). After the initial synchronization snapshot, replication can refresh data in published tables periodically—based on the schedule you specify. Although snapshot replication is the easiest type to set up and maintain, it requires copying all data each time a table is refreshed.
Between scheduled refreshes, data on the publisher might be very different from the data on subscriber. In short, snapshot replication isn't very different from emptying out the destination table(s) and using a DTS package to import data from the source.
Transactional Replication
Transactional replication involves copying data from the publisher to the subscriber(s) once and then delivering transactions to the subscriber(s) as they occur on the publisher. The initial copy of the data is transported by using the same mechanism as with snapshot replication: SQL Server takes a snapshot of data on the publisher and moves it to the subscriber(s). As database users insert, update, or delete records on the publisher, transactions are forwarded to the subscriber(s).
To make sure that SQL Server synchronizes your transactions as quickly as possible, you can make a simple configuration change: Tell it to deliver transactions continuously. Alternatively, you can run synchronization tasks periodically. Transactional replication is most useful in environments that have a dependable dedicated network line between database servers participating in replication. Typically, database servers subscribing to transactional publications do not modify data; they use data strictly for read-only purposes. However, SQL Server does support transactional replication that allows data changes on subscribers as well.
Merge Replication
Merge replication combines data from multiple sources into a single central database. Much like transactional replication, merge replication uses initial synchronization by taking the snapshot of data on the publisher and moving it to subscribers. Unlike transactional replication, merge replication allows changes of the same data on publishers and subscribers, even when subscribers are not connected to the network. When subscribers connect to the network, replication will detect and combine changes from all subscribers and change data on the publisher accordingly. Merge replication is useful when you have a need to modify data on remote computers and when subscribers are not guaranteed to have a continuous connection to the network.
Replication can be used effectively for many different purposes, as discussed in the following sections.
Providing High Availability
Occasionally, you might consider using replication for high availability; that is, to replicate transactions from the main server to a standby server. If the main server fails, you can then point your data sources to the standby server. Be aware that using replication for high availability takes careful planning and testing. Replication does not provide any sort of automatic fail-over. SQL Server supports other methods of providing high availability, such as clustering and log-shipping, which might be more appropriate for your environment.
Transporting Data
Another common use for replication is to simply move data changes from publishers to subscribers. This method is particularly useful for moving transactional data to a data warehousing server, in which it is transformed and aggregated for OLAP reporting. SQL Server provides other ways of transporting data: DTS, BCP, BULK INSERT statements, and others. Be sure to carefully consider the alternatives before implementing replication because other solutions might be cheaper or even faster than replication.
Replication needs to be planned carefully. Setting things up is easy, but there is no magic UNDO button that will reverse all your actions. Therefore, be sure to test your plan thoroughly before implementing a replication solution. The following sections discuss some of the planning steps necessary for transactional replication.
Comments