Master-Slave Cascading Replication
Version 1.0.0
Context
You are designing a replication solution for the
following requirements:
A replication set is to be replicated from a single
source to many targets that all require substantially the
same replication data.
The replicated data in the targets is read-only, or if
it is updated at the targets by any applications, it is
accepted that these updates can be overwritten by later
transmissions. This is called a master-slave relationship.
Hence the replication flow is one-way, from the source to
the targets, and neither conflict detection nor conflict resolution is triggered at the targets because of target changes.
Figure 1 summarizes this overall replication scenario.
Figure 1: Overall replication scenario
You know you could design direct replication links from
the source to each target, but the potential impact on the
source, and possibly the source availability, is a concern.
Therefore, you want to find another approach that reduces
this concern and is also an efficient way to replicate this
common replication set to many targets.
Problem
How can you optimize the replication to a set of targets
in a master-slave environment, and minimize the impact on
the source?
Forces
Any of the following compelling forces would
justify using the solution described in this pattern:
Too many passes on the source. Every replication
link that starts from a source requires a pass over the
replication set to acquire it. The resources (for example,
CPU time and I/O activity) needed for the required number of
passes might not be available on the source database server,
or they may cost too much.
Very large replication set. Even with a moderate
number of replication links to the source, the total
overhead on the source database server can become
unsustainable if the amount of data to be transmitted to the
targets is large.
Significant growth in replication needs anticipated.
Concerning both of the preceding forces, you anticipate a
significant growth in the number of targets and amount of
data to be transmitted. Therefore it is important to
implement a replication topology that can sustain the
predicted growth.
Need to offload replication set from source as
quickly as possible. Acquiring data impacts source
resources and you must minimize the duration of the impact.
For example, if you are replicating across a slow
communications link, you may prefer to offload the source
quickly and then replicate to the target from this offloaded
set.
No direct connection between source and target.
Due to your network topology, you might not be able to
directly link the source and target, but you can connect to
a third place.
The following enabling forces facilitate the move
to the solution, and their absence could hinder such a move:
Targets can tolerate the delays implied by
replication. The timeliness with which the data arrives
at any one of the targets depends on the replication link,
which frequently includes a network link. Adding more
replication links from the source to the final target
generally increases the delay until changes made to the
source replication set appear at the target.
Great similarity in the replication sets to be
replicated. The core of this pattern is that all the
replication data comes from the same original source
replication set. Within this fundamental constraint, each
replication link can have its own replication set to be
replicated, which can differ from the replication set of
other replication links. Although the structure differences
between each source/target pair might be fairly small, the
overall differences could be significant.
Data Replication requires that the source and the
target of every replication building block be very similar.
Master-Slave Cascading Replication requires that all databases along the whole chain of replication links be highly similar. Otherwise, an Extract-Transform-Load (ETL) approach would be more useful.
Solution
Increase the number of replication links between the source and the targets by adding one or more intermediary targets between the original source and the end target databases shown in Figure 1. Specifically, this arrangement adds the concept of the cascade intermediary target/source (CITS) to the topology, as Figure 2 shows. These intermediaries are data stores that take a replication set from the source and thus act as the target of a first replication link. They then act as sources that move the data along the next replication link, and so on, until the data reaches the cascade end targets (CETs).
Figure 2: Master-Slave Cascading Replication with a
single intermediate target/source
Figure 2 shows a very simple example of a Master-Slave
Cascading Replication topology. Each Acquire,
Manipulate, and Write (AMW) box in the figure represents a
replication link. For more information about the replication
building block, see the Data Replication pattern.
In general, several CITSs can be connected to the same
source and a CITS can also be connected to several other
CITSs. Regardless of the number of CITSs, Master-Slave Cascading Replication arranges them in a tree with the source as the root, the CITSs as inner nodes, and the CETs as the leaf nodes.
For discussion purposes, it is helpful to define a few
more specific terms for the replication links in a topology:
Initial link. The initial link connects a
source to a CITS.
Intermediary link. The intermediary link
connects a CITS to another CITS.
End link. The end link connects a CITS to
a CET.
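The tree structure and the link terminology just defined can be illustrated with a small model. The following Python sketch is illustrative only; the Node class, its role property, and the classify_link helper are invented names and are not part of any replication product.

# Minimal sketch of a Master-Slave Cascading Replication topology.
# Node and function names are illustrative, not from any product API.

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A database in the topology: the source, a CITS, or a CET."""
    name: str
    parent: Optional["Node"] = None            # upstream node in the chain
    children: List["Node"] = field(default_factory=list)

    def add_target(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    @property
    def role(self) -> str:
        if self.parent is None:
            return "source"                     # root of the tree
        return "CET" if not self.children else "CITS"


def classify_link(upstream: Node, downstream: Node) -> str:
    """Name the replication link according to the pattern's terminology."""
    if upstream.role == "source":
        return "initial link"
    if downstream.role == "CET":
        return "end link"
    return "intermediary link"


# Example: one source, one CITS, two CETs (the topology of Figure 2).
source = Node("Source")
cits = source.add_target(Node("CITS 1"))
cet1 = cits.add_target(Node("CET 1"))
cet2 = cits.add_target(Node("CET 2"))

print(classify_link(source, cits))   # initial link
print(classify_link(cits, cet1))     # end link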
The characteristics of the end links are the same as if
the targets were connected to the source directly. This
means that the end links can be configured for full or
incremental replication depending on the requirements, and
that they can start a transmission immediately after every
transaction, periodically, or on demand.
Hint: The addition of CITSs to the replication topology, however, impacts the service level offered to the CETs. The initial and intermediary links must transmit any data or changes early enough for the intermediary or end links that follow them. Thus, it is common
practice to design an immediate replication here. If all
end links only do periodic or on-demand replication, a
periodic replication on the initial and intermediary
links would be sufficient. For these reasons, you should
not design an on-demand replication on an initial or
intermediary link, because the timeliness of some of the CITSs and their corresponding targets would depend on a user or operator starting the transmission.
The choice of the replication frequency also impacts
the choice of the replication refresh policy. If the
initial and intermediary links have been configured for
immediate replication, you will have to use incremental
replication to transmit only the changes. Incremental
replication is also generally the best choice to
transmit changes for periodic replication at the initial
and intermediary links. If the replication sets are
small enough, another option is to use a snapshot
replication on the initial and intermediary replication
links.
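The guidance in this hint can be expressed as a simple configuration check. The following Python sketch is a hypothetical validation helper whose rules restate the hint above; the function and parameter names are invented.

# Hypothetical validation of link configurations against the hint above.
# Link type is initial, intermediary, or end; frequency is immediate,
# periodic, or on-demand; refresh is incremental or snapshot.

def check_link(link_type: str, frequency: str, refresh: str,
               small_replication_set: bool = False) -> list:
    """Return a list of warnings for a proposed link configuration."""
    warnings = []

    if link_type in ("initial", "intermediary"):
        if frequency == "on-demand":
            warnings.append(
                "On-demand replication on an initial or intermediary link "
                "makes downstream timeliness depend on an operator.")
        if frequency == "immediate" and refresh != "incremental":
            warnings.append(
                "Immediate replication should transmit only the changes; "
                "use incremental replication.")
        if refresh == "snapshot" and not small_replication_set:
            warnings.append(
                "Snapshot refresh upstream is only an option for small "
                "replication sets.")

    return warnings


# Example: an intermediary link configured for on-demand snapshots.
for w in check_link("intermediary", "on-demand", "snapshot"):
    print("Warning:", w)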
Next Considerations
To design a Master-Slave Cascading Replication
topology for your environment, you must do the following:
Determine the number of CITSs to use.
Design the replication links from the source to the CITSs and from the CITSs to the CETs.
Determine how much data is required for each CITS.
Define the data structure of the CITSs.
Define the manipulation in each replication link.
The following sections explore these issues.
Number of CITSs
A single CITS removes most of the load from the source
database server because there is only a single replication
link from the source to the CITS. Thus, the only overhead to
the source is that single replication link.
However, if you design just a single CITS, you introduce
two new single points of failure: the CITS and the
additional replication link. An additional CITS helps to
mitigate this effect because you can design an alternative
chain of replication links from the source to each of the
targets. Although the additional replication links that are
now connected to the source cause a slight increase in
replication overhead compared to a single replication link, the overall availability increases because the alternative chain acts as a backup to the standard chain, as Figure 3 shows.
Figure 3: Master-Slave Cascading Replication with an
alternative chain (dotted arrows)
Hint: After you have two (or more) CITSs
connected to the source, you can connect parts of the
CETs to each of them. This achieves some load balancing
on the CITSs because every CITS serves fewer CETs. In
case of a failure, the CETs are served by one of the
remaining CITSs.
The replication links to both CITSs must transmit the
same replication set. Additionally, the CITSs must not be
written to by any process but the replication link from the
source. If one of these conditions fails, the CITSs could
have different data. In that case, they would not be able to
serve as substitutes for each other.
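The load balancing and failover behavior described above can be pictured with a small assignment sketch. The following Python example is hypothetical; the node names and the health map are placeholders, and a real deployment would rely on the administration tools of the replication product. It assumes that both CITSs receive the same replication set and accept no other writes, so that they can substitute for each other.

# Hypothetical sketch: distribute CETs across two equivalent CITSs and
# reassign them when one CITS fails.

def assign_cets(cets, citss, healthy):
    """Round-robin CETs over the healthy CITSs; return {cet: cits}."""
    available = [c for c in citss if healthy.get(c, False)]
    if not available:
        raise RuntimeError("no healthy CITS available")
    return {cet: available[i % len(available)] for i, cet in enumerate(cets)}


cets = ["CET 1", "CET 2", "CET 3", "CET 4"]
citss = ["CITS A", "CITS B"]

# Normal operation: each CITS serves half of the CETs.
print(assign_cets(cets, citss, {"CITS A": True, "CITS B": True}))

# CITS A fails: all CETs are served by the remaining CITS.
print(assign_cets(cets, citss, {"CITS A": False, "CITS B": True}))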
More CITSs can also be added if a single CITS cannot
serve all CETs. Adding CITSs also increases scalability
because you can add new CITSs to accommodate a growing
number of CETs. If the number of CITSs consequently impacts
the source in an unsustainable way, you can even add another
layer of CITSs. This increases the chain length by one
replication link, but again frees the source database server
from the additional load.
Hint: Adding CITSs can also help you optimize
for different replication characteristics because the
CITSs can be structured in different ways. If, for
example, some of the CETs require snapshot replication,
while others require incremental replication, you can
optimize the structure of one of the CITSs for storing change data and the other one for storing the data itself. The CETs requesting the changes will connect to the first CITS, while the CETs requesting snapshots will connect to the CITS that is optimized for the snapshots.
Generally, you should look for clusters of CETs with
similar replication characteristics and then design a
dedicated CITS for each of these clusters. Thereafter,
you can optimize every CITS to best support the
replication links that have similar characteristics.
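As a rough illustration of this clustering step, the following Python sketch groups CETs by their replication characteristics and proposes one dedicated CITS per group. The CET names and requirements are invented for the example.

# Hypothetical sketch: cluster CETs by replication characteristics and
# dedicate one CITS to each cluster, as suggested in the hint above.

from collections import defaultdict

# Invented example data: each CET and the kind of replication it needs.
cet_requirements = {
    "CET 1": "incremental",
    "CET 2": "incremental",
    "CET 3": "snapshot",
    "CET 4": "snapshot",
    "CET 5": "incremental",
}

clusters = defaultdict(list)
for cet, characteristic in cet_requirements.items():
    clusters[characteristic].append(cet)

# One CITS per cluster, optimized for that cluster's characteristic:
# a change store for incremental CETs, a data store for snapshot CETs.
for i, (characteristic, members) in enumerate(sorted(clusters.items()), 1):
    print(f"CITS {i} (optimized for {characteristic}): serves {members}")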
Limiting the Number of Replication Links
You must have at least one chain of replication links
between the source and every CET to transmit the data or its
changes. As described earlier, you can design an alternative
chain of replication links from the source to every target
to achieve higher availability for the whole system. Do not
overdo it by designing too many alternative chains, however,
because the additional replication links increase the load
on the source. It is best to design at most one standard
chain of replication links plus one alternative chain.
Furthermore, designing additional replication links should
be reserved for when you feel that normal data availability
techniques, such as clustering, storage area networks, or
hot standbys, are not suitable.
Amount of Data for Each CITS
The portion of the source replication set that is stored on each CITS must satisfy the requirements of all the CETs connected to it. Thus, the amount of data stored on each CITS is determined by the logical union of the data requested by its CETs and by the type of replication being used.
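The logical union can be illustrated with a small calculation. In the following Python sketch, the table names and CET subscriptions are invented; the point is only that each CITS must hold (or track changes for) at least the union of what its connected CETs request.

# Hypothetical sketch: the replication set a CITS must hold is the union
# of the subsets requested by the CETs connected to it.

cet_subscriptions = {
    "CET 1": {"Customers", "Orders"},
    "CET 2": {"Orders", "OrderDetails"},
    "CET 3": {"Products"},
}

# The CITS must store (or track changes for) at least this union.
cits_replication_set = set().union(*cet_subscriptions.values())
print(sorted(cits_replication_set))
# ['Customers', 'OrderDetails', 'Orders', 'Products']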
Data Structure for Each CITS
To determine the data structure for each CITS, choose one of the following design options:
Matching the data structure of the CITS to the
source. This enables the movement of data from the
source to the CITS without any additional manipulation
overhead. This design is important if the main goal of your
cascading replication is to remove any avoidable load from
the source.
Matching the data structure of the CITS to the CET
superset. In this case, the manipulation is performed
only once, namely within the replication link from the
source to the CITS. The targets can be fed easily by the
contents of the CITS. This provides a higher overall
efficiency with the tradeoff of some impact on the source
that could have been avoided.
Designing a data structure that differs from both the
source and the CETs. If all replication links to the
CETs perform incremental replication, the CITSs do not have to store the data itself, only the changes. In this case, the data
structure of the CITSs can be designed for the storage of
changes only.
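For the third option, the CITS stores change records rather than the data itself. The following Python sketch shows one minimal, hypothetical shape for such a change record; real replication products define their own change-tracking structures.

# Hypothetical sketch of a changes-only store for a CITS, as in the third
# design option: the CITS keeps change records, not the data itself.

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class ChangeRecord:
    sequence: int                 # ordering of changes from the source
    table: str                    # affected table in the replication set
    operation: str                # "insert", "update", or "delete"
    key: Dict[str, Any]           # primary key of the affected row
    values: Dict[str, Any]        # new column values (empty for delete)
    captured_at: datetime         # when the change was acquired


change_log: List[ChangeRecord] = [
    ChangeRecord(1, "Orders", "insert", {"OrderID": 1001},
                 {"CustomerID": 42, "Status": "open"},
                 datetime.now(timezone.utc)),
    ChangeRecord(2, "Orders", "update", {"OrderID": 1001},
                 {"Status": "shipped"}, datetime.now(timezone.utc)),
]

# An end link performing incremental replication reads the log in order.
for change in change_log:
    print(change.sequence, change.operation, change.table, change.key)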
Examples
The following examples present two possible
configurations of Master-Slave Cascading Replication.
Different Lengths of Replication Chains
This first example assumes that you have a single source
and a large number of CETs. A small number of the CETs
receive snapshots, while the others are served by
incremental replication. The snapshot replication is
transmitted by way of a single CITS. The number of CETs
served by incremental replication is too large to be served
by a single CITS, however. To minimize the impact on the
source, you could design two levels of CITSs for the incremental replication, with only a single first-level CITS connected directly to the source. Figure 4 shows
the resulting replication topology where thick arrows
represent replication links with snapshot replication and
thin arrows represent replication links with incremental
replication.
Figure 4: Master-Slave Cascading Replication topology
with different chain lengths
Two Sources and Conflict Detection and Resolution
Figure 5 shows a replication topology in which a CET
participates in two master-slave cascading replications.
Figure 5: Master-Slave Cascading Replication from two
sources
If the replication sets of Source 1 and Source 2 do not
intersect, then replication from Source 1 by way of CITS 1
to CET 2 always affects different records than those from
Source 2 by way of CITS 2. Thus, no special attention is
required in CET 2 to handle both replication chains.
However, if the replication sets of Source 1 and Source 2
do intersect, the same CET 2 record can be affected by both
the replication from Source 1 through CITS 1 and Source 2
through CITS 2. Resolving the discrepancy requires the
ability to detect and resolve conflicts in CET 2. The same
applies if two or more sources feed the same CITS.
Note: The conflict detection and resolution is
not triggered by updates having occurred at the target,
which is why this is not a master-master pattern. In
this case, the trigger is that different updates
occurred at two sources. However, the concepts described in Master-Master Replication still apply to solving this problem.
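When the replication sets intersect, CET 2 needs a deterministic rule for records touched by both chains. The following Python sketch shows one common policy, last writer wins based on the timestamp of the change at its originating source; the record layout is invented, and other policies from Master-Master Replication, such as source priority, could be used instead.

# Hypothetical sketch of conflict resolution at CET 2 when the same row
# arrives from two sources. Policy shown: last writer wins, based on the
# timestamp of the change at its originating source.

def apply_change(current_rows, change):
    """Apply a change to CET 2, keeping the most recent source update."""
    key = change["key"]
    existing = current_rows.get(key)
    if existing is None or change["source_time"] >= existing["source_time"]:
        current_rows[key] = change          # incoming change wins
    # else: the row already holds a newer update; the incoming one loses


rows = {}
apply_change(rows, {"key": 7, "source": "Source 1",
                    "source_time": 10, "value": "A"})
apply_change(rows, {"key": 7, "source": "Source 2",
                    "source_time": 12, "value": "B"})
apply_change(rows, {"key": 7, "source": "Source 1",
                    "source_time": 11, "value": "C"})   # older, ignored

print(rows[7]["source"], rows[7]["value"])   # Source 2 B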
Resulting Context
This pattern inherits the benefits and liabilities from
the Data Replication pattern and has the following
additional benefits and liabilities:
Benefits
Source is freed from most of the replication load.
This is the most important benefit of a Master-Slave
Cascading Replication. Only the first replication link
adds load to the source. The remaining replication links do
not burden the source. The CITS generally should not serve
any applications so that conflicting operational demands
between applications and replication services can be
avoided.
CETs can be relatively autonomous. Using a CET is
a good way to provide data to other organizations because
you can pass raw data on to the organizations and they can
use the data however they want. Because you cannot force
another organization to pull the data frequently, though,
this could impact your source database system (for example,
if the organization connected to your database directly).
Master-Slave Cascading Replication liberates the source
from this impact; a CITS is more appropriate to handle the
impact because it does not serve any applications.
Adding more targets does not impact the source.
As your business requires more CETs, you can add them without
overburdening the source.
Liabilities
Increased latency. Because the chains from the
source to the targets are longer compared to direct
replication, the delays in getting the replication set to
the CETs can increase. Most implementations of this pattern
use an immediate replication on the replication links to
minimize this liability.
Potential for decreased availability. The longer
chains from the source to the targets have an impact on the
overall availability as well. As the number of links in the
chain increases, the opportunity for failures increases. You
can address this liability by adding a second CITS and
alternative chains in case of failures. A second CITS also
offers the opportunity for load balancing by connecting half
of the targets to each of the CITSs.
Additional administration and management. Master-Slave Cascading Replication adds databases and replication links that must be administered and managed. The whole replication environment should be controlled by management tools that automatically monitor the ongoing operation.
Extra storage cost. The CITS will add storage
requirements to the overall environment.
Additional change management. Structural changes
to the source or the CETs require more attention because the CITSs have to be adjusted appropriately. You should carefully plan and design the changes on all affected databases.
Operational Considerations
When you apply Master-Slave Cascading Replication, most of the replication overhead is shifted to the CITSs.
Hence, it is common practice that the CITSs do not serve any
applications. Instead, the applications are connected to the
source and the targets only. All applications requiring
write access to the database must be connected to the
source.
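This operational rule can be summarized by a simple connection-routing helper such as the hypothetical Python sketch below: writes go to the source, reads go to one of the CETs, and the CITSs are never handed out to applications.

# Hypothetical sketch of the connection rule above: applications write to
# the source, read from a CET, and never connect to a CITS.

import random

SOURCE = "source-db"
CITSS = ["cits-1", "cits-2"]          # reserved for replication only
CETS = ["cet-1", "cet-2", "cet-3"]


def connection_for(needs_write: bool) -> str:
    """Return the database an application should connect to."""
    if needs_write:
        return SOURCE                  # all writes go to the source
    return random.choice(CETS)         # reads are spread over the targets


print(connection_for(needs_write=True))    # source-db
print(connection_for(needs_write=False))   # one of the CETs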
Related Patterns
For more information, see the following related patterns:
Patterns That May Have Led You Here
Move Copy of Data. This is the root
pattern of this cluster. It presents the fundamental data
movement building block that consists of source, data
movement set, data movement link, and target. Transmissions
in such a data movement building block are done
asynchronously (or eventually) after the update of the
source. Thus, the target applications must tolerate a
certain amount of latency until changes are delivered.
Data Replication. This pattern presents
the architecture of a replication.
Master-Slave Replication. This pattern
presents the solution for a replication where the changes
are replicated to the target without taking changes of the
target into account. It will eventually overwrite any
changes on the target.
Patterns That You Can Use Next
Implementing Master-Slave Transactional Incremental
Replication Using SQL Server.
Other Patterns of Interest
Master-Slave Snapshot Replication. This
pattern presents a solution that transmits the whole
replication set from the source to the target on each
transmission.
Master-Slave Transactional Incremental Replication.
This pattern presents a solution that transmits only the
changes from the source to the target on a
transaction-by-transaction basis.