Log Reader Agent Algorithm
Download
Report
Transcript Log Reader Agent Algorithm
DAT 412:Transactional
Replication Internals
Matt Hollingsworth
Program Manager
Microsoft Corporation
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Transactional Replication
Snapshot
Agent
\\Snapshot Share
Publishing
database
Distribution
Agent
DB
log
•
•
•
•
Log Reader
Agent
Subscribing
database
Distribution
database
Snapshot Agent writes initial data to Snapshot Share
Publisher changes tracked and extracted by Log Reader
Agent
Distribution database (store) acts as reliable queue
Distribution Agent forwards changes to Subscribers
Transactional Replication
Updating subscribers
Publisher
2PC, RPC
Distributor
Queue
Queue Reader Agent
NETWORK
Queue
Immediate Updating
Subscriber
Read-Only
Subscriber
Queued Updating
Subscriber
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
General Agent Overview
•
•
•
Agents implement the core functions of replication
Initiated via SQLAgent or from command line
Maintain connections to repl server instances
– Connection info supplied on agent command line
– Can use integrated or SQL Server security
•
•
SQL Server security: SecurityMode=0, login and password
Integrated security: Security Mode=1, no login or password
– LoginTimeOut and QueryTimeOut configurable
•
Can be scheduled or run on demand
– Continuous: sleeps for PollingInterval between loops
•
Logging and History
–
–
–
–
HistoryVerboseLevel [1|2|3]
OutputVerboseLevel [0|1|2]
Output
ErrorFile
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Properties
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Snapshot Agent Algorithm
•
•
Connects to the Publisher and Distributor
For each article in the publication
– Reads publisher schema metadata
– Generates schema scripts into snapshot
directory according to user article options
(.sch, .idx, .dri files)
– Bulk copies table data into snapshot directory
(.bcp files) according to user defined row and
column filters
•
Updates status for subscription
to active
Snapshot Agent Parameters
• MaxBcpThreads
– Improves parallelism
during BCP out
• Misc.
– ReplicationType =1
(transactional)
– RowDelimiter
– FieldDelimiter
Snapshot Sync Properties
• Controls how the snapshot is generated
and applied
• sp_addpublication
– @sync_method
•
•
sets snapshot format (character or native)
sets snapshot concurrency behavior
– @immediate_sync
•
•
true: subscriber can retrieve the initial data
immediately after subscribing
false: subscriber must wait for next snapshot to
synchronize
– @pre/post_snapshot_script
•
executes user defined custom scripts
Snapshot Scripting Properties
How to configure
• sp_addarticle @pre_creation_cmd
•
•
– none, delete, drop or truncate
sp_addarticle @schema_option
– 0x00
User specified creation script (@creation_script)
– 0x01
Object creation
– 0x02
Custom SP’s
– 0x10
Clustered indexes
– 0x20
Convert UDT’s to base data types
– 0x40
Nonclustered indexes
– 0x80
PK Indexes
– 0x100
Triggers
– 0x200
FK Constraints
– 0x800
Defaults
– 0x1000
Column-level collation
– 0x2000
Extended properties
– 0x4000
UK Constraints
– 0x8000
PK Constraints
UI: configure via article properties page
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Log Reader Agent Algorithm
• Connects to the publisher and distributor
• Scans the publisher log, building a list of committed tx
which are marked for replication = sp_replcmds.
• For each command in the committed transactions
–
For each publication receiving the command
•
inserts data into the distributor using the user specified article format
and column filter = sp_Msadd_repl_cmds27 if the row matches the
specified user row filter
• Moves replication truncation mark in log = sp_repldone
• Commits transaction batch at distributor
• Logs history messages into the distribution database
Note: Log Reader must run before a concurrent snapshot is available for subscribers
demo
Log Reader Agent
Profiler Trace
Log Reader Agent Parameters
•
Batching
–
ReadBatchSize – max number of transactions read from
publisher log per processing loop
•
–
ReadBatchThreshold – max number of commands read from the
publisher log per processing loop
•
–
always honors a tx boundary
MaxCmdsInTran – max number of commands written to
distributor as a transaction
•
•
always honors a tx boundary
Warning: forces a commit when the limit is reached. This may result
in transactional inconsistency at the subscriber in certain cases.
Other Parameters
–
–
PacketSize
MessageInterval
Command Formats
• The Log Reader writes statements as
“commands” into the distributor
MSrepl_commands table
• The command format is defined by the user
as an article property
• These commands are applied directly to the
subscriber by the distribution agent
• This flexibility is valuable for application
specific processing and performance tuning
“None” Command Format
• Log Reader skips the specified command
type instead of inserting it into the
distribution db
• When to use
– Great for auditing, use “None” for the delete
command
• How to configure
– sp_addarticle @ins/upd/del_cmd = ‘None’
– UI: configure via article properties page
“SQL” Command Format
•
•
Commands stored in distribution DB as SQL
inserts, updates, and deletes
When to use
– Heterogeneous subscribers are present
– Lower distribution performance is acceptable
•
•
Less SQL plan sharing occurs at the subscriber
How to configure
– sp_addarticle @ins/upd/del_cmd = ‘SQL’
– To include column names in insert statements, set
@status = 8
– ODBC parameterization improves performance. Set
@status = 16 or 24
– UI: configure via article properties page
Custom Command Formats
•
Commands stored in distribution db as stored procedure
calls = sp_Msins/upd/del<article>
•
When to use
–
–
–
•
Highest performance distribution
Provides application flexibility since the stored procedure can
be customized
No heterogeneous subscribers are present
How to configure
– Specify @schema_option = 0x02 to generate
subscriber side stored procedures
– Configure @status = 16 to improve performance via
parameterization
– UI: configure via article properties page
‘CALL’ Command Format
• How SP parameters are specified
– Inserts
• all column values
– Updates
• initial pk column values
• all final column values for the updated row
– Delete parameters generated
• pk column values
• When to use
– for highest insert and delete performance
– when custom subscriber logic doesn’t require the
values for columns in deleted rows
• How to configure
– sp_addarticle @ins/upd/del_cmd = ‘CALL’
– UI: configure via article properties page
‘MCALL’ Command Format
•
How SP parameters are specified
–
Inserts
•
–
Updates
•
•
•
–
not supported
When to use
–
–
•
initial pk column values
final column values for the specific columns updated
bitmask indicating which columns were updated
Deletes
•
•
not supported
for highest update performance
when custom subscriber logic doesn’t require the before values for
columns
How to configure
–
–
sp_addarticle @ins/upd/del_cmd = ‘MCALL’
UI: configure via article properties page
‘XCALL’ Command Format
•
How SP parameters are specified
– Inserts
•
not supported
– Updates
•
all initial and final column values
– Deletes
•
•
all initial column values
When to use
– For the most flexible subscriber side custom logic
and transforms
•
How to configure
– sp_addarticle @ins/upd/del_cmd = ‘XCALL’
– UI: article properties page
demo
Browsing Distributor
Commands with
sp_browsereplcmds
User Trigger Considerations
•
•
•
Some publisher statements are applied as different sets
of statements at the subscriber
Concurrent snapshot processing applies compensating
commands after snapshot is delivered to subscriber
By default, SQL Server logs updates to unique
constraints in replicated tables as as deletes followed by
inserts
–
–
–
SQL Server 2000 SP1 added new dbcc option that converts
single row unique constraint updates from delete/insert pairs
back into single update statements for the subscriber
This trace flag is not available for queued and immediate
updating subscribers
How to configure
•
•
dbcc traceon(8207,-1)
Note: updates to columns in unique keys that affect multiple rows
are always applied to the subscriber as deletes followed by inserts.
Stored Proc Considerations
•
Two SP publishing methods exist:
1. Replicate the rows affected by the SP
2. Replicate the execution of the SP
•
When to use
–
•
Publish execution for SP’s that affect large numbers of rows.
Provides significant performance improvement (10x).
How to configure
–
–
–
–
–
sp_addarticle @type
value of ’proc exec’ replicates via RPC
value of ’serializable proc exec’ replicates via RPC if publisher
call is within a serializable transaction
other values will replicate the individual rows affected by the
stored procedure
UI: article properties page
View Considerations
• Standard Views
– Replicated as scripted schema object
definitions
• Requires publishing the underlying table(s)
• Indexed Views
– Replicated as scripted schema object
definitions
– Replicated as a base object like a table
• only the changes to the indexed view data are
replicated
• underlying table(s) do not need to be published
Text/Image Considerations
• Size
– Server option ‘max text repl size’ sets the
largest text/image value on the replicated table
– Data beyond the max size is not allowed
• Error: Length of text data 1000 to be replicated
exceeds configured maximum: 500."
• Text/Image Update Limitations with Repl
– No updates during concurrent snapshot
– Repeatable read tx isolation level or greater
required when using text pointers
– XCALL before image not supported
– No updates at subscriber for text/image
Loopback Considerations
• Includes originating server instance and database
with each distributor command to prevent looping
command distribution scenarios
• When to use
– Necessary for Bi-directional (reciprocal) topologies
where each database publishes and subscribes to the
other
• How to Configure
– sp_addsubscription @loopback_detection = ‘true’ for
both subscriptions
Backup Considerations
•
Publisher restored to previous point-in-time
–
–
–
•
Distributor restored to previous point-in-time
–
–
–
•
Problem: distributor and subscriber may contain rows later than
restored database!
Solution: set “sync with backup” option on publisher to prevent
commands from inserting into distributor until the log entries are
backed up
Side Effect: increases command distribution latency
Problem: Distributor loses commands that have not been delivered to
subscribers when they are already truncated from publisher log!
Solution: set “sync with backup” option on distributor to prevent
publisher log truncation until the distributor has backed up entries
Side Effect: may increase size of publisher log
Subscriber restored to previous point-in-time
–
No issues if within the distributor retention interval since transaction
timestamp on subscriber is also restored
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Distribution Agent Algorithm
• Connects to the distributor, subscriber, and
snapshot share
• Applies scripts and bcp’s snapshot data into
subscriber. Depends on status and
sp_addsubscription @sync_type
• Gets next batch of commands for the subscription
from the distributor MSrepl_commands table =
sp_msget_repl_commands
– Determines last applied xact id for subscription by
querying MSreplication_subscriptions
transaction_timestamp
• Applies the literal commands directly to the
subscriber
• Commits the batch
demo
Distribution Agent
Profiler Trace
Distribution Agent Parameters
• Performance related parameters
–
–
–
–
–
–
BcpBatchSize
CommitBatchSize
CommitBatchThreshold
MaxBcpThreads
MaxDeliveredTransactions
UseInprocLoader
–
–
–
–
–
–
–
–
MessageInterval
TransactionsPerHistory
NoTextInitOnSync
ProfileName
DefinitionFile
Publication
QuotedIdentifier [0|1|2]
Note: FTP parameters are obsolete
• Misc parameters
Distribution Tuning Tips
•
Offload distribution agent load
–
•
Tune databases
–
–
•
Minimize indexes on subscriber
Tune I/O for subscriber and distributor: RAID 0+1 for data,
separate log
Increase Batching
–
•
Configure pull subscriptions
Increase CommitBatchSize, CommitBatchThreshold, and
MaxBcpBatchSize parameters
Increase agent parallelism
–
–
–
Use independent agent: sp_addpublication
@independent_agent =’true’
Configure multiple publications for each subscriber according to
transactional requirements
Increase MaxBcpThreads parameters
Scripting Replication Stored Procs
• To manually generate subscriber replication procs
–
–
–
–
exec sp_helparticle (to get article id)
exec sp_scriptinsproc
exec sp_scriptupdproc
exec sp_scriptdelproc
• To manually generate queued conflict tables and
triggers
– sp_makeconflicttable
– sp_addsynctriggers
Scripting Replication Stored Procs(2)
• Useful for manually
generating an update proc
that is optimized for tables
with large numbers of
indexes
– Exec Sp_scriptdynupdproc
Subscriber
Procedure
Indexes on subscriber table
Throughput,
cmds/sec
Default MCALL
1 clustered index
1,570
1 clustered index, 10 nonclustered
indexes
110
1 clustered index, 10 nonclustered
indexes
1,180
Dynamic MCALL
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Queued Subscriber Trigger Logic
• SQL DML modification to subscription table
occurs
• Replication trigger fires and inserts data
into local subscriber queue
– Queue can be a SQL Server table
(msreplication_queue) or MSMQ
• Queue Reader Agent will later collect these
changes from the queue and apply to the
publisher
Queue Reader Agent Algorithm
• Connects to the distributor, subscriber, and publisher
• For each subscription serviced by the distributor
– Gets next batch of commands from the subscriber queue =
MSreplication_queue via stored procedure = sp_replsqlqgetrows
– Applies commands to the publisher table
sp_MSsync_ins/upd/del_<table>_#
– Note: manages an internal thread pool to keep CPU’s busy and
increase parallelization. Groups approximately 1000 commands
per batch.
• Commits the publisher rows
• Conflicts detected via msrepl_tran_version
uniqueidentifier column differences
• Note: ranged updates (update foo set x=x+1) to unique
key columns will fail at the publisher since they are
applied as individual update statements.
demo
Queue Reader Agent
Profiler Trace
Queue Reader Agent Parms
•
•
You can configure who wins in case of
conflicting updates
ResolverState [1|2|3]
1. Publisher wins conflict
2. Publisher wins conflict with reinitialization of
subscriber
3. Subscriber overwrites existing Publisher
values
Administering the Queues
• sp_getqueuedrows
– Returns the transactions that are stored in the
queue
• sp_replqueuemonitor
– Returns the commands that are stored in the
queue
• sp_setreplfailovermode
– Switches between immediate updating and
queued updating modes
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Distributor Cleanup Agents
• Distribution Cleanup
– Deletes rows from the distribution Msrepl_cmds and
Msrepl_transactions table which are outside the distribution
database retention period
– sp_adddistributiondb @max_distretention,
sp_MSdistribution_cleanup
• History and Logging Cleanup
– Deletes rows from each distributor agent history table which are
outside the history retention period
– sp_adddistributiondb @max_historycleanup,
sp_MShistory_cleanup @history_retention
• Expired Subscription Cleanup
– Removes subscriptions which have not synchronized within the
publication retention period
– sp_addpublication @retention
– sp_expired_subscription_cleanup
Other Replication Agents
• Replication Checkup
– Raises error 14151 for all agents that have not logged
messages within the specified interval
– sp_replication_agent_checkup @heartbeat_interval
• Re-init Subs with Data Validation Failures
– Marks all subscriptions for re-initialization that have
failed a validation check
– sp_MSreinit_failed_subscriptions
Agenda
•
•
•
•
•
•
•
•
Review
General Agent Overview
Snapshot Agent
Log Reader Agent
Distribution Agent
Queue reader Agent
Other Replication Agents
Summary
Replication Summary
• Fast
– Can deliver 2500 commands per second
– Latency is often a few seconds
• Flexible
– Modular agent approach maximizes application
flexibility
– Stored procedures and DMO permit finegrained control
• Consistent
– Data is kept transactionally consistent
Other Resources
• DAT411 Programming and Deploying
Microsoft SQL Server 2000 Replication:
Lessons Learned
– Thursday 15:00 - 16:15, Room 3
• Transactional Replication Performance
Tuning whitepaper
– http://www.microsoft.com/SQL/techinfo/adminis
tration/2000/ReplPerf.asp
Don’t forget to complete the
on-line Session Feedback form
on the Attendee Web site
https://web.mseventseurope.com/teched/
© 2002 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.