Publish / Subscribe
Database Log Shipping over
BitTorrent P2P
CS 848
Fall 2006
University of Waterloo
Project Presentation by N. Tchervenski
Intro
 Implemented a tool to facilitate publish / subscribe of databases.
 Technologies used:
    Log shipping
    BitTorrent
    RSS
Motivation
 Looking for an easy and quick way to create read-only replicated databases with minimal new infrastructure and minimal overhead
 Instead of keeping a standby replica idle, it can be used for queries
 Log shipping can be performed on many of the popular DB systems: DB2, Oracle, MS SQL Server, PostgreSQL, Teradata, etc.
 Transferring large amounts of data can be done using P2P such as BitTorrent
Architecture
[Diagram: publisher and subscriber sides connected over the Internet]
 Publisher side: DB Server; archived logs & backup images; archived logs directory; Publishing Tool; BitTorrent seeder; RSS feeds of tracker data
 Internet
 Subscriber side: RSS client; BitTorrent client; archived logs directory; Subscription Tool; DB log management tool commands; DB restore and rollforward; DB Replica
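The RSS leg of the architecture can be sketched roughly as follows. This is only an illustration, not the project's actual tool: the function names, the feed layout (one item per torrent, with the .torrent URL carried in an RSS enclosure) and the example URLs are all assumptions.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_feed(title, link, torrents):
    """Publisher side (hypothetical): one RSS 2.0 <item> per published torrent.

    `torrents` is a list of (file_name, torrent_url, size_bytes) tuples.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Archived logs and backup images"
    for name, url, size in torrents:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = name
        ET.SubElement(item, "guid").text = url
        ET.SubElement(item, "pubDate").text = formatdate(usegmt=True)
        # The enclosure points at the .torrent file, not at the log itself.
        ET.SubElement(item, "enclosure", url=url, length=str(size),
                      type="application/x-bittorrent")
    return ET.tostring(rss, encoding="unicode")


def new_torrents(feed_xml, already_seen):
    """Subscriber side (hypothetical): enclosure URLs not yet given to the BitTorrent client."""
    root = ET.fromstring(feed_xml)
    urls = [enc.get("url") for enc in root.iter("enclosure")]
    return [u for u in urls if u not in already_seen]
```

A subscriber would poll the feed, pass each new URL to its BitTorrent client, and remember it in `already_seen` so that the same log is not fetched twice.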
Features
 Minimum impact on the server
 No need to capture data
 Can be part of regular backup / replication process
 Can send data to as many or as few peers as needed
 Log shipping is popular, so existing scripts and infrastructure can be reused
 Sharing through BitTorrent is flexible: can limit upload speed, number of connections, block IPs, etc.
Current Limitations
 Database backups are not portable across platforms or database versions
 Moves the whole database, rather than just the data
    Needs similarly configured machines (access control, paths, etc.)
 Delay when bringing the database up after rollforward (index rebuilding, etc.). To include new logs, need to roll back and then roll forward again, which cannot be done too often.
 Not suitable for databases with lots of updates
 When a LOAD is done (DB2), a tablespace backup needs to be provided, or the data location must be available to the remote DB
 Security
    Authorization to download
    BitTorrent transfers can be slowed down by malicious peers sending garbage data
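Because each rollforward is costly, the subscriber side benefits from checking which downloaded logs can actually be applied before starting one. The sketch below is an assumption-laden illustration, not the project's code: it relies on DB2's SNNNNNNN.LOG naming for archived logs, the helper names are hypothetical, and the command string it builds should be checked against the DB2 documentation for the version in use.

```python
import re

# DB2 archived log files are named S0000000.LOG, S0000001.LOG, ...
LOG_NAME = re.compile(r"^S(\d{7})\.LOG$")


def contiguous_logs(file_names, next_expected):
    """Return the logs that can be applied now, in sequence order.

    Stops at the first gap: BitTorrent downloads may finish out of order,
    and a later log is of no use until every earlier one is present.
    """
    present = {}
    for name in file_names:
        match = LOG_NAME.match(name)
        if match:
            present[int(match.group(1))] = name
    ready = []
    n = next_expected
    while n in present:
        ready.append(present[n])
        n += 1
    return ready


def rollforward_command(db_alias, log_dir):
    """Build (but do not run) a DB2 command-line rollforward invocation."""
    return (f'db2 "ROLLFORWARD DATABASE {db_alias} TO END OF LOGS '
            f'OVERFLOW LOG PATH ({log_dir})"')
```

Batching several contiguous logs into one rollforward is one way to keep the number of restarts low.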
Related Work
 DPROPR: IBM DataPropagator Relational
 Clients subscribe to particular rows / columns of tables
 Can receive a full refresh or just updates
 In updates-only mode, capture control tables are used
Testing
 Testing and implementation were done using
    DB2 V9
    Linux (Ubuntu)
    BitTorrent client: Enhanced CTorrent
Conclusion
 Based on gluing together existing technologies
 A way to use a standby replica
 Legitimate use of BitTorrent
 Hope this will stir more related research
 Ideal for public databases
References
 [1] DPROPR Planning and Design Guide, http://www.redbooks.ibm.com/abstracts/sg244771.html
 [2] DB2 Replication Guide and Reference, ftp://ftp.software.ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2e0e82.pdf
 [3] Warm Standby Servers for High Availability, http://www.postgresql.org/docs/8.2/static/warmstandby.html